


WORLD INTELLECTUAL PROPERTY ORGANIZATION

International Bureau 




INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT)



(51) International Patent Classification 5 
C12P 21/00, C12N 15/00 



A1



(11) International Publication Number: WO 91/02077 

(43) International Publication Date: 21 February 1991 (21.02.91) 



(21) International Application Number: PCT/US90/04239

(22) International Filing Date: 30 July 1990 (30.07.90) 



(30) Priority data:
386,053    28 July 1989 (28.07.89)    US



(71) Applicant: THE UNITED STATES OF AMERICA, represented by
THE SECRETARY, UNITED STATES DEPARTMENT OF COMMERCE
[US/US]; Washington,

(72) Inventors: MIKI, Tom ; 261 Congressional Lane, Apt T-19,
Rockville, MD 20852 (US). AARONSON, Stuart, A. ;
1006 Harriman Street, Great Falls, VA 22066 (US).
FLEMING, Timothy ; 8151 Needwood Road, Apt. 104,
Derwood, MD 20855 (US).



(74) Agents: OLIFF, James, A. et al.; Oliff & Berridge, P.O.
Box 19928, Alexandria, VA 22320 (US).

(81) Designated States: AT (European patent), AU, BE (European
patent), CA, CH (European patent), DE (European patent)*,
DK (European patent), ES (European patent), FR (European
patent), GB (European patent), IT (European patent), JP,
LU (European patent), NL (European patent), SE (European
patent).

Published 

With international search report. 
Before the expiration of the time limit for amending the 
claims and to be republished in the event of the receipt of 
amendments. 



(54) Title: EFFICIENT DIRECTIONAL GENETIC CLONING SYSTEM
(57) Abstract 

A [...] genetic cloning system is disclosed which is particularly useful for cloning cDNA copies of eukaryotic
mRNA [...] plasmid composite vectors [with] large cloning capacities. Cleavage of [...]
DNA [by a] restriction [enzyme], for example, creates two different non-symmetrical 3' extensions at the ends of vector
DNA. [...] cDNA is prepared to have different ends which can be ligated to those of
[...]. [When the cDNA] fragments and the vector DNAs are mixed, both the molecules can assemble without self-circularization
due to base-pairing specificity. This system provides (1) high cloning efficiency (10^7-10^8 clones/µg poly(A)+ RNA), (2) low
background (more than 90% of the clones contain inserts), (3) directional insertion of cDNA fragments into vectors, (4)
[...] in each [clone], (5) [...] of inserts (up to 10 kb), (6) a mechanism for rescue of the
plasmid part from a λ genome, and (7) a straightforward protocol for library preparation.



* See back of page 



DESIGNATIONS OF "DE" 



Until further notice, any designation of "DE" in any international application 
whose international filing date is prior to October 3, 1990, shall have effect in the
territory of the Federal Republic of Germany with the exception of the territory of the 
former German Democratic Republic. 



FOR THE PURPOSES OF INFORMATION ONLY 

Codes used to identify States party to the PCT on the front pages of pamphlets publishing international 
applications under the PCT. 



AT  Austria                     ES  Spain                                   MC  Monaco
AU  Australia                   FI  Finland                                 MG  Madagascar
BB  Barbados                    FR  France                                  ML  Mali
BE  Belgium                     GA  Gabon                                   MR  Mauritania
BF  Burkina Fasso               GB  United Kingdom                          MW  Malawi
BG  Bulgaria                    GR  Greece                                  NL  Netherlands
BJ  Benin                       HU  Hungary                                 NO  Norway
BR  Brazil                      IT  Italy                                   PL  Poland
CA  Canada                      JP  Japan                                   RO  Romania
CF  Central African Republic    KP  Democratic People's Republic of Korea   SD  Sudan
CG  Congo                                                                   SE  Sweden
CH  Switzerland                 KR  Republic of Korea                       SN  Senegal
CM  Cameroon                    LI  Liechtenstein                           SU  Soviet Union
DE  Germany                     LK  Sri Lanka                               TD  Chad
DK  Denmark                     LU  Luxembourg                              TG  Togo
                                                                            US  United States of America



WO 91/02077 PCT/US90/04239


EFFICIENT DIRECTIONAL GENETIC CLONING SYSTEM 



BACKGROUND OF THE INVENTION 

FIELD OF THE INVENTION

The present invention relates to vectors
for molecular cloning of DNA segments, particularly
to cloning vectors employing non-symmetrical
restriction enzyme recognition sites for insertion
of DNA segments in a defined direction relative to
the vector. This invention also relates to use of such
vectors in methods for efficient cloning of genomic
DNA segments and of complementary DNA (cDNA) copies
of messenger RNA (mRNA) molecules from eukaryotic
genes, and to the manufacture and use of novel
products related to these vectors and methods.

BACKGROUND OF THE INVENTION 

The development of DNA cloning techniques
for complementary DNA (cDNA) copies of messenger RNA
(mRNA) molecules has been of great value in the
study of eukaryotic genes. In many cases, the
amount of a given mRNA for which cDNA clones are
desired is limited by the availability of
appropriate tissue sources and/or a low
concentration of that specific mRNA in those
sources. Therefore, readily obtainable sources may
provide only a few copies of a given mRNA molecule
from which cDNA clones might be produced.

The requirements for any efficient method
for cDNA cloning may be generally summarized as
follows: first, full-length double-stranded cDNAs



must be produced from the mRNA with high yield; the
ends of the resulting DNA fragments must be made
capable of being joined efficiently to the vector
DNA by enzymatic ligation; production of undesirable
ligation byproducts must be minimized; and,
preferably, insertion of the cDNA into the vector
DNA should provide expression of the cDNA to
facilitate detection of the desired clone by means
of the product.

Production of the protein product may be
necessary for detecting a gene when no nucleic acid
probes for the desired gene are available. More
generally, such expression of the protein is
desirable because, in terms of copy number, the
protein provides a molecular signal that is greatly
amplified in relation to the DNA molecules of the
cloned gene inside the host cell.

As it is difficult to achieve high
efficiency of conversion of mRNA molecules into
full-length cDNA clones, especially when the mRNA of
interest is relatively long, several refinements in
cDNA cloning strategy have been made. Among them,
the Okayama-Berg method significantly improved the
efficiency of full-length cDNA cloning.

The Okayama-Berg approach has several
advantages over previous, conventional methods for
cloning cDNAs. The following section is intended to
highlight these advantages in relation to the main
steps of this complicated method. For a more
complete and detailed description of the method, see
the original publication [Okayama, H. and Berg, P.
(1982) Mol. Cell. Biol. 2, 161-170], which is hereby
incorporated herein by reference.

The main advantages of the Okayama-Berg
method for cDNA cloning relate to the fact that as
part of the processing needed to form mRNAs,
transcripts of eukaryotic genes undergo enzymatic
addition of multiple adenosine residues at the 3'
end, thereby acquiring what is known as a "poly(A)
tail". In the present context, the term mRNA
encompasses any RNA species from any source, natural
or synthetic, having a 3' poly(A) tail comprising
two or more adenosine residues.

In the original Okayama-Berg approach, 
synthesis of the first DNA strand from the mRNA 
template is initiated by annealing the 3' poly(A) of
the eukaryotic mRNA to an oligo(dT) primer which 
forms an extension of one end of a DNA strand of the 
cloning plasmid. First strand cDNA synthesis by 
this "plasmid-priming" method directs the 
orientation of the sequence within the cDNA into a 
unique relationship with the sequence in the 
plasmid; hence, this approach has been called 
"directional" cloning. Directional cloning ensures 
that every cDNA clone that is formed will be 
correctly oriented for a promoter provided in the 
cloning plasmid (an SV40 promoter in the original 
Okayama-Berg system) to drive transcription of the 
proper cDNA strand to produce RNA with the correct 
sense for translation into the protein encoded by 
the original mRNA template. 

To provide high efficiency of ligation in 
cloning DNA segments in general, restriction 
nucleases are utilized to produce short single- 
stranded ends on the DNA that are complementary in 
base sequence to any other DNA end produced by the 
same enzyme. Accordingly, these single-stranded 
ends can anneal together by forming specific DNA 
base pairs, or, in the vernacular, they are 
"sticky". This annealing greatly enhances the rate 
of joining DNA segments by enzymatic ligation and 
further provides a means for selectively joining 
ends of segments treated with the same enzyme.

In the original Okayama-Berg method, after
synthesis of the first cDNA strand, an oligo(dG)
tail is attached enzymatically to the free end of
the plasmid-primed cDNA, and then the plasmid is
cleaved by a restriction enzyme (HindIII) to produce
a sticky end on the plasmid opposite to the end
where the cDNA is attached. A short DNA fragment
("linker"), which contains the SV40 promoter and has
a cleaved HindIII site on one end and oligo(dC) on
the other, is then attached to the cDNA-plasmid
molecule by ligation, to circularize the molecule.

In other, more conventional methods a
(synthetic) linker may also be used to clone cDNAs,
but it is attached after second strand DNA synthesis
and further enzymatic repair which is necessary to
form perfectly matched strands (i.e., a "flush" or
"blunt" end). To protect internal restriction sites
of the double-stranded cDNA from cleavage by the
restriction enzyme required to allow ligation of the
vector and linker, prior to addition of the linker,
the cDNA is methylated with the appropriate DNA
modification system associated with the given
restriction enzyme. However, such protection may
not be absolute; thus, internal sites may be cleaved
at some frequency due to an incomplete methylation
reaction. In contrast, in the original Okayama-Berg
method, this problem of internal cleavage of
cDNAs is obviated by cleavage of HindIII sites on
the vector when the cDNA is represented as an
RNA:DNA hybrid that resists restriction.

The Okayama-Berg approach provides yet
another advantage over previous methods in which
both ends of separately synthesized cDNAs are
ligated to the vector ends at the same time, namely
that according to Okayama-Berg, the necessary
circularization of the vector DNA with the cDNA
attached at one end is relatively efficient via the
linker because only one juncture between the cDNA
and vector molecules remains to undergo ligation.

Furthermore, the overall Okayama-Berg 
approach offers additional advantages over previous 
methods. Following circularization, a process
called "RNA nick translation" using DNA polymerase I 
and RNase H is used which facilitates complete 
synthesis of the second strand along the entire 
first strand. This process overcomes the inherently 
low processivity of DNA polymerase I by using 
multiple sites for priming of second strand DNA 
synthesis with DNA primer fragments having random 
sequences. 

Finally, since the Okayama-Berg vector has
already been joined to the cDNA when the second
strand is synthesized, truncation of cDNA molecules
close to the 3' end of the cDNA generally does not
occur, in contrast to other methods in which the
second strand is completed while the 3' end of the
first strand is free and, therefore, more
susceptible to damage from nuclease activities.

Cloning vectors based on bacteriophage λ
are also known. The second strand synthesis
reaction of the Okayama-Berg method has also been
utilized in a simpler cloning procedure [Gubler, U.
and Hoffman, B. J. (1983) Gene 25, 253-269],
allowing cDNA cloning in such λ vectors [Huynh,
T. V., Young, R. A. and Davis, R. W. (1985) in DNA
Cloning, a Practical Approach, ed. Glover, D. (IRL,
Oxford), Vol. I, pp. 49-78]. This λ-based cDNA
cloning method has been widely used, mainly due to
the high efficiency of transmission of recombinant
DNA into cells by means of infectious phage
particles, which are produced with in vitro DNA
"packaging" systems. λ phage cloning systems also
offer convenient clone screening capabilities due to
tolerance of a high density of λ plaques on test
plates to be screened, compared with most plasmid
systems which permit only lower densities of
bacterial host colonies.

Early λ systems for cDNA cloning, however,
while retaining the second strand synthesis strategy
of the original Okayama-Berg plasmid method, lack
some of its other advantages. For example,
directional cloning is not possible in those
original λ systems. In addition, multiple inserts
and truncated cDNAs are frequently obtained.
Further, despite the high packaging efficiency for
native λ DNA molecules, the packaging efficiency of
recombinant DNA molecules that are produced by
cleavage of intact linear λ molecules and ligation
with cDNA fragments is usually low compared to that
of intact λ DNA.

Recently, directional cloning capabilities
have been introduced into various λ vectors. For
example, one such directional λ vector employs a
site for insertion of DNA segments that comprises
two different restriction enzyme cleavage sites
[Meissner, P. S., et al. (1987) Proc. Nat. Acad.
Sci. USA, 84, 4171-4175]. The cDNA molecules are
primed with oligo(dT), made double-stranded, and
then methylated with the enzymes needed for
protection against internal cleavage by both of the
nucleases used in the DNA insertion site of the
vector. A linker segment containing a cleavage site
for only one of the nucleases of the insertion site
is added to both ends of the cDNA. The combination
of the last two A:T base pairs on the 3' end of the
cDNA with the sequences at one end of the linker,
however, creates a cut site for the other of the two
nucleases of the insertion site. Thus, after
restriction with both nucleases of the insertion
site, the individual cDNA segments can ligate into
the vector only in a single direction with respect
to the two different cleavage sites in the vector.

Various general disadvantages of this
particular approach for cDNA cloning in λ phage,
compared to the Okayama-Berg plasmid method, have
been described above in relation to other systems;
and other problems specific to this approach have
been noted [Meissner, P. S., et al. (1987), supra].
Nevertheless, it was reported that one cDNA library
constructed by this method, starting from 5 µg of
mRNA, contained about 2 x 10^8 clones with 8 of 10
having cDNA inserts (i.e., the reported cloning
efficiency was about 3 x 10^7 recombinants per µg of
poly(A)+ RNA).

Directional cloning in other λ phage
vectors has also been reported [Palazzolo, M. J. and
Meyerowitz, E. M. (1987) Gene 52, 197-206]. [These
vectors are known as λSWAJ or λGEM, certain variants
of which (LambdaGEM-2 and LambdaGEM-4) are
commercially available from Promega Corporation of
Madison, Wisconsin. The λGEM type of vectors are
also examples of a composite vector comprising both
a λ phage genome and an embedded plasmid (GEM).]
The directional cloning scheme in these λ vectors
utilizes two different restriction enzyme cleavage
sites at the site for insertion of DNA. Thus, for
example, to attach the end of a cDNA corresponding
to the poly(A) end of the mRNA to a particular end
of the cleaved vector DNA that has a sticky end for
the restriction enzyme SacI, a synthetic DNA
"linker-primer" segment is used which combines a
single-stranded oligo(dT) primer with a restriction
site for the enzyme SacI. After second strand
synthesis, a linker segment with the site of a
second restriction enzyme is ligated to the other
end of the cDNA, which is then restricted with both
enzymes of the insertion site of the vector,
according to the same strategy as described for
the previous example of a directional λ phage
vector.

This particular approach for directional
cloning in a λ vector, however, cannot be used to
obtain full-length cDNAs of certain mRNAs because it
requires cleavage of the cDNA molecules by the
restriction enzyme SacI and a second enzyme (e.g.,
XbaI) without first protecting the internal sites
for these enzymes by appropriate methylation. [In
an alternative version of the scheme reported by
Palazzolo and Meyerowitz, supra, the XbaI enzyme was
replaced by EcoRI and the cDNA was methylated to
protect against only this one enzyme.] Sites for
these particular enzymes occur frequently by chance
in natural nucleotide sequences. Thus, restriction
of cDNAs with enzymes like these, as taught in this
approach, causes truncation of cDNA inserts with
internal SacI (and/or XbaI) sites. In relation to
cloning efficiency, it may be noted that this
publication described a single cDNA library
constructed by this method, starting from 1 [µg] of
poly(A)+ RNA, that contained about 1.6 x 10^6 clones
with cDNA inserts.

In addition to the publications on
directional cloning systems described above, there
is a report which describes a non-directional
plasmid-based system that uses an efficient
oligonucleotide-based strategy to promote cDNA
insertion into the vector [Aruffo, A. and Seed, B.
(1987) Proc. Nat. Acad. Sci. USA 84, 8573-8577].
This method uses synthetic DNA adaptors that encode
a recognition site for a particular restriction
enzyme, BstXI, which has a variable recognition
sequence, as illustrated below:

           ↓
5'-CCANNNNNNTGG-3'
3'-GGTNNNNNNACC-5'
      ↑

where A, T, G and C indicate nucleotides having the
DNA bases adenine, thymine, guanine, and cytosine,
respectively (for which the pairs A:T and G:C are
complementary), and N on either strand represents a
base that is included within the recognition site
sequence but that can be any of the usual DNA bases,
provided only, of course, that each N and the
corresponding N on the opposite DNA strand be
complementary. The arrows (↓ and ↑) indicate the
cleavage sites on the upper and lower DNA strands,
respectively. Accordingly, cleavage of the BstXI site
creates a 4-base single-stranded extension (sticky
end) on the 3' end that varies from site to site.
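
The cleavage geometry just described can be sketched in a few lines of Python. This is only an illustration and not part of the disclosed method: the function name `bstxi_overhangs` and the example sequence are hypothetical, and the cut positions are read from the recognition sequence shown above (upper strand cut after the eighth base of the 12-base site, lower strand after the fourth), so that the 3' extension consists of bases 5 to 8 of the site.

```python
import re

# BstXI pattern as given in the text: CCA, six variable bases, TGG.
# The lookahead lets overlapping sites be reported as well.
BSTXI = re.compile(r"(?=(CCA[ACGT]{6}TGG))")


def bstxi_overhangs(seq):
    """Return (site_start, overhang) pairs for BstXI sites on the given strand.

    The overhang is the 4-base 3' extension, written 5'->3', that is left on
    the upper strand of the left-hand fragment after cleavage.
    """
    seq = seq.upper()
    return [(m.start(), m.group(1)[4:8]) for m in BSTXI.finditer(seq)]


# Hypothetical example: one site whose middle four variable bases are TGTG.
example = "TTCCAATGTGCTGGAA"
print(bstxi_overhangs(example))
```

Because the four overhang bases fall entirely within the variable part of the site, two BstXI sites can give different sticky ends, which is the property the text relies on.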

The report above discloses a plasmid
vector with a site for insertion of DNA segments in
which two identical BstXI sites were placed in
inverted orientation with respect to each other and
were separated by a short replaceable segment of
DNA. Inversion of a DNA sequence consists of
representing the base sequence of each strand,
conventionally expressed in the 5' to 3' direction
of the polynucleotide backbone, in a DNA strand with
the same base sequence presented in the 3' to 5'
direction (e.g., inversion of the DNA sequence
5'-ACTG-3' produces the DNA sequence 3'-ACTG-5' or, in
the conventional 5' to 3' format, 5'-GTCA-3').
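
As a minimal sketch of the inversion operation defined above (the helper name `invert` is hypothetical and is introduced only for illustration), inversion of a single strand amounts to writing the same bases in the opposite order when the result is expressed in the conventional 5' to 3' format:

```python
def invert(seq_5_to_3):
    """Invert a strand as defined in the text.

    The input is written 5'->3'. The inverted strand carries the same bases
    in the 3'->5' direction, so its conventional 5'->3' notation is simply
    the reversed string.
    """
    return seq_5_to_3[::-1]


original = "ACTG"
inverted = invert(original)
print("5'-%s-3' inverted is 3'-%s-5', i.e. 5'-%s-3'" % (original, original, inverted))
```

Applying `invert` twice returns the original sequence, which is a quick sanity check on any site that is said to be present in both orientations.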

With the particular BstXI recognition
sequence that was employed in this vector, the
4-base single-stranded ends of the inverted sites
created on the two ends of the vector DNA by
restriction with the BstXI enzyme were not able to
anneal with one another. This situation is
illustrated below, where two identical sites, one
inverted relative to the other and separated by an
unspecified sequence (N...N), are shown; the sticky
ends of the vector produced by cleavage with the
BstXI enzyme are shown in bold print:

5'-(vector)CCANTGTGNTGG(N...N)CCANCACANTGG(vector)-3'
3'-(vector)GGTNACACNACC(N...N)GGTNGTGTNACC(vector)-5'

(Note that the reference does not specify the
entire BstXI recognition sequence that was used;
only the sequence of the sticky end is clearly
defined, as indicated below by inclusion of the N
symbol where necessary).

Inspection of these single-stranded end
sequences on this plasmid vector reveals that
they are identical, due to the inversion of one
of the sites relative to the other. Thus, the
ends of the vector with inverted and non-inverted
copies of this particular BstXI restriction site
sequence cannot anneal with each other.
Similarly, the restricted ends of the spacer DNA
segment between these two sites will be
identical. Accordingly, to clone cDNA segments
in this vector, a synthetic adaptor was attached
to each end at the double-stranded stage, by
blunt end ligation, giving them the same termini
as the replaceable segment that was removed from
the vector with BstXI. The specific adaptor used
in the above report comprises the following
oligonucleotide sequences:

5'-CTTTAGAGCACA-3'
3'-GAAATCTC-5'.

Obviously, addition of this single adaptor to
both ends of the cDNA segments would provide
those segments with ends (in bold type) that
could anneal and subsequently ligate efficiently
to both identical vector ends.

Thus, Aruffo and Seed, 1987, supra,
discloses a method using this particular BstXI
recognition site sequence, whereby neither the
cDNA (with attached adaptors) nor the isolated
vector DNA (after being freed from the
replaceable segment after cleavage with BstXI)
was able to ligate to itself. This work,
however, neither teaches nor suggests general
requirements for a BstXI recognition sequence, or
for those of other restriction enzymes, to be
usable in this cloning approach.

Further, as these workers pointed out,
their strategy did not provide a directional
cloning capability. After first alleging that
such directional capability was not needed, they
admitted that, nonetheless, they had devoted
considerable unsuccessful efforts to developing
an alternative means of producing mRNA from every
cDNA clone, namely a bidirectional transcription
capability whereby both strands of an inserted
cDNA would be transcribed. They concluded that
this goal cannot be easily attained, at least not
in their cloning host system. The authors
stated, moreover, that they could obtain cloning
efficiencies with their plasmid that were between
0.5 and 2 x 10^6 recombinants per µg of mRNA, which
were said to compare favorably with those
described for certain cloning systems based on
phage λ. In the only example of a cDNA library
described in this reference, however, the yield
of cDNA clones obtained by this method was
actually stated to be only approximately 3 x 10^5
recombinants from 0.8 µg poly(A)-containing RNA
(i.e., less than 0.4 x 10^6 recombinants per µg
poly(A)-containing RNA).
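
The comparison drawn above can be checked with a short calculation. The sketch below is illustrative only: it reads the figures in the preceding paragraph as 3 x 10^5 recombinants obtained from 0.8 µg of poly(A)-containing RNA and a claimed range of 0.5 to 2 x 10^6 recombinants per µg, and the function name `recombinants_per_ug` is hypothetical.

```python
def recombinants_per_ug(total_recombinants, rna_ug):
    """Cloning efficiency expressed as recombinants per microgram of RNA."""
    if rna_ug <= 0:
        raise ValueError("RNA amount must be positive")
    return total_recombinants / rna_ug


# Library actually described in the reference (figures as read from the text).
observed = recombinants_per_ug(3e5, 0.8)

# Lower bound of the efficiency range claimed by the same authors.
claimed_low = 0.5e6

print("observed efficiency: %.3g recombinants per ug" % observed)
print("below claimed range:", observed < claimed_low)
```

On these figures the single documented library falls below even the lower end of the range the authors claimed, which is the point the paragraph makes.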

Thus, there has been a continuing need
for methods and vectors which would provide a
higher yield of cDNA clones from limited amounts
of eukaryotic mRNAs while also providing an
improved means of directing orientation of
inserted cDNA fragments within vector DNAs.

SUMMARY OF THE INVENTION

The present invention contemplates the
application of methods of recombinant DNA
technology to fulfill the above needs for
increased efficiencies in DNA cloning systems
and, in particular, to develop new means for
directional insertion of cDNA fragments into
cloning vectors.

More specifically, it is an object of
the present invention to provide means for
directing assembly of insert DNAs into vector
DNAs to form a unique, predetermined recombinant
structure having the desired number and
orientation of each needed DNA fragment, so that
the number of resulting clones containing single
inserts, as well as the probability of obtaining
a full-length clone from each mRNA molecule, are
enhanced.

Further, it is an object of this
invention to provide a cDNA cloning system which
combines the features of this highly efficient
cloning strategy with advantageous features of λ
phage vectors to overcome limitations of the
presently available λ cloning systems.

Accordingly, the present invention
relates to highly efficient means for inserting
DNA segments into cloning vectors in a defined
orientation, and a method for using such means
that is referred to herein as the "automatic
directional cloning (ADC)" method. Novel DNA
vectors and DNA segments are also included.

Appreciation of the operation and
advantages of this invention requires further
analysis of the problems in the prior approaches.
The understanding of these problems by the
present inventors led to development of this
invention.

The present invention has been developed
in light of recognition by these inventors of
major sources of the limitations on cloning
efficiency with the present systems designed for
directional cloning, namely that they all employ
restriction enzymes with recognition site
sequences having one or both of the following
disadvantages: they are either too short or they
have a particular type of symmetry called "dyad"
symmetry.

As noted above, present λ phage vectors
for directional cloning of cDNAs suffer
inefficiencies due in part to their use of
restriction enzymes with recognition sequences
that occur frequently in natural DNA sequences.
Some problems relating to this issue might be
solved by choosing a restriction enzyme with an
infrequently occurring site (i.e., a longer
recognition sequence which, by chance, would
occur less frequently in random natural DNA
sequences).
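
The effect of recognition sequence length on chance occurrence can be estimated with a simple model. The sketch below is an approximation under a stated assumption that is not in the source, namely that the four bases occur independently and with equal probability; real genomes deviate from this. The function names and the 40,960 bp example length are hypothetical.

```python
def expected_spacing(fixed_positions):
    """Average distance in base pairs between chance occurrences of a site
    with the given number of invariable positions (equiprobable bases)."""
    return 4 ** fixed_positions


def expected_sites(sequence_length, fixed_positions):
    """Approximate number of chance occurrences on one strand of a sequence."""
    return sequence_length / expected_spacing(fixed_positions)


for n in (4, 6, 8):
    print(n, "invariable positions: about one site per", expected_spacing(n), "bp")

# Hypothetical 40,960 bp DNA, chosen only to give a round number.
print(expected_sites(40960, 6))
```

Under this model each additional invariable position makes a site four times rarer, so moving from six to eight invariable positions reduces the expected number of internal sites sixteen-fold.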

However, even when modified to utilize 
an infrequently cutting restriction enzyme, the 
present implementations of directional cloning in 
λ have a drawback that is common to any cloning
scheme using restriction enzymes with recognition 
sequences having dyad symmetry of the sticky ends 
produced by cleavage with the enzyme. 

Typical restriction enzymes with
recognition site sequences having dyad symmetry
make staggered cuts in the two opposing DNA
strands at symmetrical points surrounding the
center of a dyad pattern. Cleavage by this type
of enzyme produces short single-stranded ends
which are complementary in base sequence to those
of any other DNA fragment produced by cleavage
with the same enzyme.

For example, the recognition site of the
commonly used restriction enzyme, EcoRI, consists
of the following complementary sequences which,
when cleaved by the enzyme, produce the 4-base
extension of the 5' end of the DNA containing the
dyad "TTAA" (shown in bold face type):

    ↓
5'-GAATTC-3'
3'-CTTAAG-5'
       ↑

where A, T, G and C indicate DNA bases, as
described above for the BstXI site. Inspection
of this EcoRI sticky end sequence readily reveals
that inversion of this sequence produces its
complement, namely "AATT". Thus, any DNA end
produced by EcoRI can anneal to any other such
DNA end; and, therefore, any EcoRI sticky end can
also be ligated efficiently to any other such
end. Similarly, all DNA ends which are produced
by any one restriction enzyme that generates
sticky ends characterized by dyad symmetry are in
the case of each sticky end sequence readily
ligatable one to another. [Hereinafter, such
"self-ligatable" single-stranded ends of DNA that
are producible by a restriction enzyme will be
simply referred to as "symmetrical ends", and the
enzymes that produce them, as "symmetrical
restriction enzymes".]
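
The test for a "symmetrical end" stated above (inversion of the sticky end sequence produces its complement) can be written down directly. The following Python sketch is illustrative only; the helper names are hypothetical, and the two example overhangs are the EcoRI end discussed here and the TGTG end of the BstXI site discussed earlier.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def invert(seq):
    """Same bases presented in the opposite direction."""
    return seq[::-1]


def complement(seq):
    """Base-by-base complement (A:T and G:C pairs)."""
    return "".join(COMPLEMENT[base] for base in seq)


def is_symmetrical_end(overhang):
    """True if inverting the overhang yields its complement, i.e. the end
    can anneal to another copy of itself."""
    overhang = overhang.upper()
    return invert(overhang) == complement(overhang)


print("EcoRI end TTAA symmetrical:", is_symmetrical_end("TTAA"))
print("BstXI end TGTG symmetrical:", is_symmetrical_end("TGTG"))
```

An end that fails this test cannot anneal to an identical end, which is the property exploited by the non-symmetrical sites discussed in the remainder of this section.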

In light of this symmetrical nature of
many restriction enzyme recognition sites, one of
the major problems with existing directional λ
vectors can be more fully appreciated. When the
two end fragments of cleaved λ DNA (i.e., the
so-called λ "arms") are ligated with cDNA fragments,
several products are produced, only some of which
constitute the desired infectious DNA molecules
containing cDNA inserts. For instance, consider
the simplest case, when the ends on both the cDNA
and on each of the "left" and "right" λ arms (as
the two λ DNA arms have been designated in a
genetic mapping convention) have been cut by the
same symmetrical restriction enzyme. Here,
linear structures other than those with the
desired order (i.e., "left arm-cDNA insert-right
arm") may form in significant amounts during
ligation with cDNA fragments; and the cDNAs
trapped in these other, nonviable structures
cannot produce phage clones. These undesirable
ligation byproducts may include self-ligation
products of the two ends of individual vector or
cDNA segments, consisting of circular DNAs.
Ligation products in this instance may also
comprise vector-cDNA combinations containing
multiple inserts, which, even if viable, may
create problems in expression or identification
of original mRNA structure.

On the other hand, when each end of the
vector and insert cDNA molecule are ultimately
produced by two different symmetrical restriction
enzymes, as in the present directional λ systems,
these ends are then physically distinguishable in
relation to the polarity of the encoded genetic
information in each DNA segment, i.e., the
matching of complementary sticky ends on vector
DNAs and cDNAs results in the desired directional
cloning of the cDNA insert relative to functional
sequences in the vector (e.g., a promoter).
Further, circularization due to self-ligation of
cDNAs or vectors without inserts is eliminated by
the use of two different symmetrical restriction
enzymes.

Other undesirable ligation byproducts
remain, however, in the usual two enzyme approach
for directional cloning using symmetrical
enzymes. Some of these are dimers of vector or
cDNAs, which may be designated, for example, as
"tail-to-tail" or "head-to-head" dimers. Thus,
even when vector and cDNAs are made by cutting
with two different symmetrical enzymes,
head-to-head and tail-to-tail dimers are not eliminated,
although the population of desired molecules is
significantly higher.

In contrast to existing systems based on
λ phage, the automatic directional cloning method
does not permit cDNA or vector fragments to
ligate to each other, ensuring the presence of a
single insert in each clone, as well as higher
cloning efficiencies and lower backgrounds of
clones that do not contain cDNA inserts.

To accomplish these goals, the present
invention contemplates use of restriction enzymes
which produce single-stranded ends that do not
exhibit dyad symmetry (hereinafter referred to as
"non-symmetrical ends" and, correspondingly,
non-symmetrical recognition site sequences and
enzymes). Although certain preferred embodiments
of the present invention employ derivatives of
bacteriophage λ as the vector, which further
comprise embedded plasmid genomes, this invention
can be practiced with any self-replicating DNA
molecule (i.e., a "replicon") serving as the
vector for DNA cloning in any host in which the
selected replicon can be replicated.

Work cited in the Background above
describes a plasmid-based system that
advantageously employs two identical BstXI
recognition site sequences, albeit in two
different orientations. This single recognition
sequence is non-symmetrical according to the
definition in the present disclosure, although
the reference does not describe the BstXI
sequence in such terms or otherwise characterize
this sequence as such. The present invention is
clearly distinguishable from this previous
approach, as described below.

Use of BstXI sites is not readily
applicable to the λ system, due to the existence
of multiple BstXI recognition sites in the λ
phage genome, owing to the small number of base
pairs in the variable recognition sequence that
are not allowed to vary (i.e., the "invariable
base pairs" being only six).
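The effect of the number of invariable base pairs can be illustrated with a rough estimate. The following is a minimal sketch, assuming a random sequence with equal base frequencies and using the length of wild-type λ DNA (48,502 bp); the function name is illustrative only, and real genomes (and the composite vectors described here) will deviate from this idealized figure.

```python
# Expected number of occurrences of a recognition site in a random
# sequence, assuming equal base frequencies (an idealization).

LAMBDA_GENOME_BP = 48_502  # length of wild-type phage lambda DNA


def expected_sites(invariable_positions: int, genome_bp: int) -> float:
    """Expected count of a site with the given number of fixed base pairs."""
    return genome_bp / 4 ** invariable_positions


bstxi = expected_sites(6, LAMBDA_GENOME_BP)  # six invariable base pairs
sfii = expected_sites(8, LAMBDA_GENOME_BP)   # eight invariable base pairs

print(f"6 invariable positions: about {bstxi:.1f} expected sites")
print(f"8 invariable positions: about {sfii:.2f} expected sites")
```

Under these assumptions a six-position site is expected about a dozen times in a λ-sized genome, while an eight-position site is expected less than once, which is the motivation for requiring more than six invariable positions.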

Accordingly, in one aspect the present
invention relates to a genetic cloning vector 
comprising at least one replicon, and a site for 
inserting DNA segments to be cloned that includes 
at least two non-symmetrical restriction enzyme 
recognition sequences that are identical, where 
the first of these identical recognition 
sequences is in the inverted orientation with 
respect to a second identical sequence; and, in 
addition, the identical restriction enzyme 
recognition sequences include greater than six 
positions having invariable DNA base pairs.
Recognition site sequences of the enzyme SfiI,
for example, fulfill both the length and 
asymmetry requirements for this aspect of the 
invention, as will become evident below. 

On the other hand, in plasmid systems or 
other replicons lacking BstXI sites, either
naturally or due to genetic engineering, two
BstXI site sequences that are at once
non-symmetrical and nonidentical can be
advantageously employed for efficient cloning of
DNA segments according to the present invention. 

More generally, this aspect of this 
invention may be practiced with any two non-
symmetrical restriction enzyme recognition
sequences that are not identical (recognition
site sequences of the enzyme SfiI, for instance).

Accordingly, the present invention also 
relates to a genetic cloning vector comprising at
least one replicon, and a site for inserting DNA
segments to be cloned that includes at least two
non-symmetrical restriction enzyme recognition
sequences that are nonidentical. In this aspect
of the invention, two of the non-symmetrical
restriction enzyme recognition sequences can be
selected advantageously to be cleavable by a
single restriction enzyme, for example, BstXI or,
alternatively, SfiI; or each of two nonidentical
restriction enzyme recognition site sequences may
be selected to be cleavable by a different
enzyme. Preferably, at least one of the non-
symmetrical recognition sequences includes
greater than six positions having invariable DNA
base pairs; and most preferably, two nonidentical
recognition sequences include greater than six
positions having invariable DNA base pairs, as
typified by SfiI recognition site sequences.

The present invention further relates to 
a vector, as described above, in which the
replicon comprises a form of bacteriophage λ.

The vector may advantageously further 
comprise regulatory elements located in relation 
to the site for insertion of DNA segments such 
that, when a DNA segment is inserted into this
site, at least a portion of the sequences of the
DNA segment is transcribed. This portion may be
derived from either one of the strands of the
inserted double-stranded DNA segment, or from
both of these strands. 

In one major embodiment of this aspect 
of this invention, these regulatory elements in 
the vector consist of promoters that entirely 
originate from bacteriophage. By the phrase 
"originate from" it is meant that the regulatory 
element (e.g., promoter) is encoded in the genome 
of the instant organism or virus (e.g., 
bacteriophage) as it occurs in nature. It should 
be noted here that it is well known that, 
generally, promoters that originate from 
bacteriophage are not able to initiate 
transcription in eukaryotic hosts. This 
particular embodiment of the present invention is 
exemplified by two λ-plasmid composite vectors,
LambdaGEM™-11 and LambdaGEM™-12, which are
commercially available from Promega Corporation 
of Madison, Wisconsin. 

According to available information at 
the time of the present disclosure, these 
particular LambdaGEM™ vectors apparently were 
first disclosed in the 1988/1989 Catalogue and 
Applications Guide for Biological Research 
Products published and distributed by Promega 
Corporation in August of 1988, the entirety of 
which is hereby incorporated herein by reference. 
The following excerpts of that catalog describe 
these particular vectors and some of their 
various uses, particularly those relating to 
transcription from bacteriophage promoters. 

Section 11, page 5:

The LambdaGEM-11 vector is a multi- 
functional genomic cloning vector designed for 
high resolution mapping of recombinant inserts, 
simplified genomic library construction, ultra-
low background of non-recombinants, and rapid
genomic walking. This lambda replacement-type
cloning vehicle contains the following features
(Figure 2 [not shown]): dual opposed
bacteriophage T7 and SP6 RNA polymerase
promoters, flanking asymmetric SfiI restriction
sites, and a multiple cloning site with
strategically positioned XhoI and BamHI
restriction sites. The LambdaGEM-11 vector also
contains unique sites for SacI, AvrII, EcoRI, and
XbaI. Because it is a derivative of EMBL3 (1),
DNA fragments ranging from 9-23kb can be cloned
in the LambdaGEM-11 vector and the Spi phenotypic
selection against non-recombinants is available.
The vector was designed to make use of the SfiI
recognition sites flanking the promoters for the
high resolution restriction mapping of insert DNA
using the Sfi linker mapping system (Sec. 11,
pg. 8).

The T7/SP6 phage promoters simplify
chromosomal "walking", as RNA probes synthesized
from the extremities of the cloned insert can be
used to search a library for overlapping
sequences in either direction. In addition, the
nucleotide sequence of the end of an insert
cloned in the LambdaGEM-11 vector can be obtained
directly from the phage template by hybridizing
an SP6 or T7 oligonucleotide primer, followed by
a chain termination sequencing reaction (2,3).

Two cloning strategies for genomic
library construction, using DNA partially
digested with MboI or Sau3AI, are available with
the LambdaGEM-11 dephosphorylated BamHI arms. A
new cloning strategy (4) relies on the exclusive
specificity with which partially filled-in XhoI
LambdaGEM-11 arms (XhoI half-site arms) can be
combined with partially filled-in Sau3AI digested
genomic DNA. The only ligation products possible
are single copies of genomic inserts with
appropriate arms, since the partial fill-in
prevents self-ligation reactions of vector arms,
central stuffer, and genomic fragments. This
method also makes genomic DNA fractionation
unnecessary, is very rapid (Figure 3 [not
shown]), and requires small amounts of starting
material. The XhoI and BamHI sites in the
LambdaGEM-11 vector are strategically positioned 
6 and 11 bases, respectively, from the 
transcription initiation site of either promoter. 

As measured by in vitro packaging, 
recombinant efficiencies of 3 x 10 r pfu/µg DNA
have been achieved using a test insert ligated to
LambdaGEM-11 dephosphorylated BamHI or XhoI half-
site arms using Packagene® lambda packaging
extracts. The background for self-ligated arms
alone is typically <100 pfu/µg DNA in either
case. This ultra-low background level of non-
recombinant vector DNA has three important
advantages: it eliminates the need for the Spi
genetic selection against the parental vector,
which is known to result in biased libraries (5),
non-productive ligation events are minimal,
thereby resulting in larger genomic libraries,
and fewer filters need be processed for screening
a library. For detailed protocols describing the
use of this vector, see Sec. 11, pg. 12.

Section 11. page fit 

The LambdaGEM-12 vector is a multi- 
functional genomic cloning vector designed for 
high resolution restriction mapping of 
recombinant inserts, simplified genomic library
construction, ultra-low background of non-
recombinants, and rapid genomic walking. This
lambda replacement-type cloning vehicle contains
the following features (Figure 4 [not shown]):
dual opposed bacteriophage T7 and SP6 RNA
polymerase promoters, RNA polymerase promoters
[sic], flanking asymmetric SfiI restriction
sites, and a multiple cloning site with
strategically positioned NotI and BamHI
restriction sites. The LambdaGEM-12 vector also
contains unique sites for SacI, EcoRI, XhoI, and
XbaI. Because it is a derivative of EMBL3 (1),
DNA fragments ranging from 9-23kb can be cloned 
in the LambdaGEM-12 vector and the Spi phenotypic 
selection against non-recombinants is available. 

Accordingly, the present invention 
relates to a vector that is either LambdaGEM 11 
or LambdaGEM 12. Further details of the use of 
these vectors for restriction mapping of inserted 
DNA segments, according to the Sfi Linker Mapping 
System mentioned above, are extracted from the 
Promega catalog below. 

Section 11, page 9:

The multi-functional LambdaGEM-11 and
LambdaGEM-12 genomic cloning vectors have been
engineered specifically for high resolution
restriction mapping of recombinant inserts. The
vectors, derivatives of EMBL3, possess SfiI
restriction sites flanking bacteriophage T7/SP6
RNA polymerase promoters and a multiple cloning
region (Sec. 11, pg. 5). The flanking SfiI
restriction sites allow most inserts to be 
excised as a single fragment, since this 8-base 
recognition sequence occurs infrequently in 
genomic DNA (in theory, once every 65,536bp). 




SfiI recognizes the interrupted
palindrome GGCCNNNN/NGGCC and cleaves within the
central unspecified sequence, leaving a 3-base 3'
overhang. The nucleotide sequence of the central
region which becomes the overhanging termini thus
may contain any of the four possible bases. The
flanking SfiI sites in the vectors have been
designed in an asymmetric fashion, so that the
site on the left is distinct from the site on the
right. Therefore, radiolabeled linkers
complementary to either the left or right SfiI
termini can be ligated separately to the SfiI
excised genomic DNA. Once the insert has been
asymmetrically labeled, a high resolution
restriction map can be determined by partial
digestion with a frequent cutting restriction
endonuclease such as Sau3AI followed by gel
electrophoresis and autoradiography (Figure 6
[not shown]).

The mapping resolution of this method is 
an order of magnitude greater than conventional 
cos site oligo labeling, since only the ends of 
the centrally located insert are labeled instead 
of the ends of the 20kb and 9kb arms of the 
vector. The variable results generated from 
inaccurate size estimates of large restriction 
size fragments, as well as anomalous bands which 
result from the fusion of the insert with a 
vector fragment, are eliminated with this system. 
For a detailed protocol describing the use of 
this system, see Sec. 11, pg. 14. 
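
The end-label mapping principle described in the excerpt above can be sketched as follows. This is a minimal illustration, not the catalog's protocol: the site positions and insert length are hypothetical, and the function name is an assumption introduced here. It shows why labeling only one end of the excised insert lets the ladder of labeled partial-digest fragments read out the site positions directly.

```python
def labeled_fragment_sizes(site_positions, insert_length):
    """Sizes of partial-digest fragments that carry the label at position 0.

    Every fragment that still carries the end label runs from position 0 to
    some cut site (or to the far end if uncut), so the labeled sizes equal
    the site positions measured from the labeled end.
    """
    sites = sorted(p for p in site_positions if 0 < p < insert_length)
    return sites + [insert_length]


# Hypothetical Sau3AI site positions (bp from the left SfiI end) in a
# hypothetical 15 kb insert.
INSERT_BP = 15_000
sites = [1200, 4700, 5300, 11900]

left_ladder = labeled_fragment_sizes(sites, INSERT_BP)
# Labeling the right end instead measures the same sites from the other side.
right_ladder = labeled_fragment_sizes([INSERT_BP - p for p in sites], INSERT_BP)

print("left-end label: ", left_ladder)
print("right-end label:", right_ladder)
```

Because the two ladders are measured from opposite ends, each site appears in both, and the two independent readings can be cross-checked against the insert length.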

Still further, the present invention 
relates to a vector having nonidentical non- 
symmetrical restriction enzyme recognition site 
sequences, as described above, also including 
regulatory elements located such that the
sequences of an inserted DNA segment are
transcribed, as in the LambdaGEM™ vectors above,
but where the regulatory elements are at least
partly of eukaryotic origin. A principal
embodiment of this aspect of the present
invention is exemplified by two λ-plasmid
composite vectors, λpCEV15 and λpCEV9, the
structures of which are depicted in Figure 1 and
described further below. 

In cloning operations with these two 
vectors, a DNA segment, a cDNA, for example, is 
cloned between two SfiI sites, A and B, as
described in the section below relating to the
automatic directional cloning method. The
vectors are designed as eukaryotic expression
vectors, utilizing the M-MLV LTR promoter, and
they contain the SV40 early promoter-driven neo
gene as a selectable marker. 

Thus, the present invention further 
relates to a genetic cloning vector comprising at
least one replicon, and a site for inserting DNA
segments to be cloned that includes at least two
nonidentical restriction enzyme recognition
sequences that are non-symmetrical, where the
vector also includes a selectable marker that is 
functional in eukaryotic cells in which the 
vector can be replicated. The term "functional" 
as used here means that the gene for the marker 
is expressed and that a selection scheme for that 
marker is operable in these eukaryotic cells. 

In these two particular exemplary
vectors, the form of λ-plasmid composite vector
was chosen to take advantage of the efficient
packaging and high density screening in λ
systems, and simpler DNA preparation and analysis
in plasmid systems. After isolation of clones of
interest, pCEV plasmids with cDNA inserts can be
obtained by NotI digestion of crude λ DNA
preparations and ligation followed by
transformation of bacterial cells. The λ
genotype which supports healthy growth (red+,
gam+) was chosen to maintain the intactness of
inserts during the amplification step of the 
libraries. Deletion and/or insertion derivatives 
generated during the amplification step, if any, 
would not accumulate in the population, since 
they would not have a growth advantage. 

λpCEV15 has several advantages over
λpCEV9 as follows: (1) λpCEV15 does not require
the supF mutation in the host cell due to the S+
allele in the λ genome. (2) λpCEV15 does not
lysogenize host strains due to the deletion of
the cI gene. (3) cDNA inserts in λpCEV15 can be
cut out by SalI digestion. (4) λpCEV9 loses the
functional SV40 promoter after the cDNA insert is
cloned, while λpCEV15 does not. (5) λpCEV15 can
accommodate longer cDNA inserts (up to 10.5 kb)
than λpCEV9 (up to 8.5 kb). (6) λpCEV15 DNA
contains a unique HindIII site.

λpCEV9 has at least two advantages over
λpCEV15. The λpCEV9 genome has a stuffer fragment
between the two SfiI sites, which would be
replaced by cDNA inserts during the cloning
process. Generally, it has been found that λpCEV9
cDNA libraries have lower backgrounds than λpCEV15
libraries, presumably because the presence of
the stuffer fragment in λpCEV9 separates the two
SfiI sites enough to ensure complete SfiI
cleavage. It has also been observed that λpCEV9
grows more stably without accumulation of
fast-growing derivatives. This is likely due to
the longer size of the genome compared to that of
λpCEV15.

In another aspect, the present invention
relates to a method for cloning of DNA segments 
referred to as the automatic directional cloning 
method. In particular, the present invention 
relates to a method for cloning a cDNA copy of a 
eukaryotic mRNA, comprising the following steps 
(which are further illustrated in Figure 3 and in 
the Description of Specific Embodiments, below): 

(i) annealing a linker-primer DNA 
segment comprising a single-stranded 
oligonucleotide which has oligo(dT) at the 3'
end, and a single-stranded extension at the 5'
end that is included in a first non-symmetrical
restriction enzyme recognition sequence.
[Note that this first recognition sequence is
identical to one of two non-symmetrical sites in
the vector that are used for directional cDNA
cloning.]

(ii) enzymatically synthesizing the
first strand of the cDNA from the linker-primer 
that is annealed with the mRNA molecule; 
[Typically, this may be accomplished using a 
reverse transcriptase. During the first strand 
synthesis reactions, the single-stranded 
linker-primer is repaired so as to be double- 
stranded. Thus, the single-stranded extension 
referred to in this method may be present as such
in the linker-primer, or it may be produced from 
a double-stranded region of linker-primer by 
cleavage with a restriction enzyme following 
ligation of the linker-primer to the cDNA. ] 

(iii) enzymatically synthesizing the 
second strand of the cDNA using the first strand 
as the template under conditions such that 
single-stranded extensions on the synthesized 
cDNA molecule are made double-stranded;
[Typically, the second strand is synthesized by
DNA polymerase I from the nicks on the RNA moiety 
introduced by RNase H associated with the reverse 
transcriptase. The linker-primer is converted to 
the double-stranded form in the first or second 
strand synthesis step. T4 DNA polymerase 
treatment makes double-stranded any 
single-stranded extensions remaining on the 
synthesized cDNA molecule.] 

(iv) ligating onto the blunt-ended cDNA 
resulting from synthesizing the second strand, an 
adaptor DNA segment comprising a second non- 
symmetrical restriction enzyme recognition 
sequence that is nonidentical to the first non- 
symmetrical restriction enzyme recognition 
sequence; [In the case of a principal embodiment 
of this aspect of the invention, ligation of the
adaptor directly adds one single-stranded
extension to the cDNA molecule.
Alternatively, however, this extension could be 
exposed by cleavage of the recognition site on a 
double-stranded portion of the adaptor after 
ligation to the cDNA. ] 

(v) exposing the cDNA resulting from 
ligation with the adaptor to one or more 
restriction enzymes that can cleave the first and 
second non-symmetrical restriction enzyme 
recognition sequences under conditions such that 
both of these sequences are cleaved, resulting in 
the vector DNA having two single-stranded ends 
that are not complementary; 

[This restriction causes exposure of at least one 
of the single-stranded extensions needed on the 
cDNA by cleavage of the recognition site on the 
repaired linker-primer portion of the cDNA 
molecule. If the non-symmetrical site in the 
adaptor is also uncleaved at this point, it may
also be restricted at this step. In a principal
embodiment, a single enzyme can cleave the non-
symmetrical sites on both the linker-primer and
the adaptor; but in other embodiments, two 
different enzymes may be required.] 

(vi) ligating the cDNA resulting from
cleavage with the enzymes to DNA of a genetic 
cloning vector, where the vector comprises 

at least one replicon; and 

a site for inserting DNA segments to be 
cloned that includes at least two non-symmetrical 
restriction enzyme recognition sequences, 

and where in the vector DNA, at least 
two non-symmetrical restriction enzyme 
recognition sequences have been cleaved by one 
or more enzymes that can cleave those recognition 
sequences, resulting in vector DNA having two
single-stranded ends that are not complementary; 
wherein further, 

[Thus, the two ends of the cleaved vector DNA 
cannot anneal and be ligated together. Cleavage 
of the vector DNA at both non-symmetrical sites
usually releases a short DNA segment from between
them, the "stuffer"; for the highest yield of
clones containing cDNA inserts, this stuffer is
removed from the cleaved vector DNA prior to 
ligation of the vector with cDNA. ] 

one of the single-stranded ends on the
cleaved vector DNA has a sequence that is 
complementary to the single-stranded extension on 
the linker-primer attached to the cDNA; and 

the other single-stranded end on the 
cleaved vector DNA has a sequence that is
complementary to the single-stranded extension on 
the adaptor attached to the cDNA; and 
[Thus the cDNA cannot circularize and is attached 
to the vector in a specific direction.]

(vii) transforming a suitable host cell
with the recombinant DNA segment comprising the 
cDNA and the vector DNA that results from the 
ligation of cDNA to vector DNA; and 

[Various genetic transformation methods known in 
the art may be used. In a principal embodiment, 
the vector is a form of bacteriophage λ and,
therefore, the recombinant DNA containing cDNA
inserts is packaged in vitro into phage particles
which are then used to infect a bacterial host
cell. Alternatively, for example, CaCl2
precipitation of DNA may be used to transform 
host cells, especially mammalian cells.] 

(viii) identifying a clone of host 
cells, resulting from transformation with said 
recombinant DNA, that contains a recombinant DNA 
segment including said cDNA. 

[Various strategies well known in the art of 
genetic engineering may be used to identify a 
clone of the desired cDNA, including 
hybridization with nucleic acid probes, 
immunological detection of expressed antigens, 
and assays for functional products, to name but a 
few. ] 

The strategy underlying this cDNA 
cloning method of the present invention is based 
on the following theory, explained in terms of 
particular examples of a principal embodiment,
which is presented to aid in understanding the 
method and does not in any way limit the scope of 
the invention as defined by the appended claims. 

When vector and insert DNA fragments are 
mixed and ligated in a typical cloning 
experiment, several molecules are produced in 
addition to those desired. These include 
self-ligation products of vectors or inserts,
head-to-tail or head-to-head dimers of vector or
insert, and vector DNA containing multiple
inserts. Formation of these molecules would
reduce the cloning efficiency. Even when vector
and insert DNAs are made by cutting with two
different enzymes, formation of ligation products
such as head-to-head dimers cannot be
eliminated, although the population of desired 
molecules is significantly higher and insertion 
occurs in a defined orientation. 

The reason why these self-ligation
products and dimers are made, as noted above, is
that the majority of restriction enzymes in common
usage recognize sequences of dyad symmetry. The
two sticky ends (S+ and S-) created with such an
enzyme contain the same single-stranded
extensions, and all combinations of the ends
including S+ and S+ can be ligated. However,
certain restriction enzymes cleave a
non-symmetrical site (A), yielding two different
sticky ends (A+ and A-). In this case, only A+
and A- ends can be ligated (see Fig. 2). When a
vector DNA containing two different sites (A and
B) with this feature is cleaved by restriction
enzymes of this kind, the stuffer fragment hemmed
by the sites is removed, and ligation is
performed with inserts having sticky ends
complementary to those of the vector,
theoretically all of the clones obtained contain
single inserts in the defined orientation. 
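
The pairing rule stated in this paragraph can be checked with a short sketch. It is a simplified model under stated assumptions: each sticky end is represented only by its single-stranded extension written 5' to 3', two ends are taken to be ligatable when one extension is the reverse complement of the other, and the example overhang sequences (AATT and TTG) are illustrative choices, not sequences from the disclosure.

```python
from itertools import combinations_with_replacement

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def can_ligate(end1: str, end2: str) -> bool:
    """Two overhangs of the same polarity anneal when one is the
    reverse complement of the other."""
    return end1 == revcomp(end2)


def ligatable_pairs(ends: dict) -> list:
    """All unordered pairs of named ends (an end may pair with another
    copy of itself) whose overhangs can anneal."""
    names = sorted(ends)
    return [(a, b) for a, b in combinations_with_replacement(names, 2)
            if can_ligate(ends[a], ends[b])]


# Symmetrical site: both ends carry the same self-complementary overhang.
symmetric = {"S+": "AATT", "S-": "AATT"}

# Non-symmetrical site: the two ends carry an overhang and its reverse
# complement (TTG is a hypothetical three-base overhang).
non_symmetric = {"A+": "TTG", "A-": revcomp("TTG")}

print(ligatable_pairs(symmetric))      # every combination pairs
print(ligatable_pairs(non_symmetric))  # only A+ with A-
```

In this model the symmetrical ends pair in every combination, including S+ with S+, whereas the non-symmetrical ends pair only as A+ with A-, which is the property the method relies on.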

In a principal embodiment of this aspect
of the invention, the restriction enzyme SfiI was
chosen to cleave both the A and B sites, because
SfiI is an infrequent cutter and leaves a
non-symmetrical 3' extension of three nucleotides
(Fig. 2A). Since the central 5 bases in the
recognition site can be any sequence, two SfiI
sites (A and B) were designed and introduced into
the vectors (Figs. 1 and 2B). The cDNA fragments
to be inserted into the vectors were oriented by
the use of oligo(dT) primers having attached the
sequence of the SfiI(B) site.

The steps for cDNA synthesis are 
schematically shown in Fig. 3. During the first 
strand synthesis reactions, the single-stranded 
linker-primer is repaired so as to be double- 
stranded. After cDNA molecules are blunt-ended, 
an adaptor, designated SfiI adaptor, having the
3' extension which fits to the SfiI(A+) end, is
ligated. After cleavage by SfiI, the resulting
cDNA molecules have different 3' extensions which
fit on the vector ends to achieve directional
cloning (Fig. 2C).
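
The way the half-site ends follow from the site sequences can be sketched in code. This is a hedged illustration only: the two site sequences below are hypothetical and are NOT the SfiI(A) and SfiI(B) sequences of Fig. 2A, and the helper names are assumptions. The cut position follows the GGCCNNNN/NGGCC pattern, which leaves bases 6 to 8 of the site as a three-base 3' extension on each product.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")
SFII_SITE = re.compile(r"GGCC[ACGT]{5}GGCC")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def sfii_half_sites(site: str):
    """Return the 3' extensions (5'->3') of the left (+) and right (-) halves.

    The top strand is cut after base 8 and the bottom strand between
    bases 5 and 6 (top-strand numbering), so bases 6-8 are single-stranded.
    """
    if not SFII_SITE.fullmatch(site):
        raise ValueError(f"not an SfiI site: {site}")
    plus = site[5:8]
    return plus, revcomp(plus)


def anneals(end1: str, end2: str) -> bool:
    """True when two 3' extensions can base-pair with each other."""
    return end1 == revcomp(end2)


# Hypothetical site sequences chosen for illustration only.
SITE_A = "GGCCATTGTGGCC"
SITE_B = "GGCCGCCTCGGCC"

a_plus, a_minus = sfii_half_sites(SITE_A)
b_plus, b_minus = sfii_half_sites(SITE_B)

vector_ends = {"A+": a_plus, "B-": b_minus}  # ends left on the vector arms
cdna_ends = {"A-": a_minus, "B+": b_plus}    # ends prepared on the cDNA

for v_name, v_end in vector_ends.items():
    for c_name, c_end in cdna_ends.items():
        print(f"vector {v_name} with cDNA {c_name}:", anneals(v_end, c_end))
print("vector end with vector end:", anneals(a_plus, b_minus))
print("cDNA end with cDNA end:", anneals(a_minus, b_plus))
```

With these example sites, each vector end pairs with exactly one cDNA end, and neither the vector ends nor the cDNA ends pair with each other, so the insert can enter in only one orientation.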

Thus, regardless of the sequences of the 
three-base single-stranded sticky ends, inversion 
of one SfiI end sequence can never produce a
self-complementary sequence. Accordingly,
regardless of the sequence of the five arbitrary
internal base pairs within any SfiI cleavage
site, the polarity of the complementarity of the
sticky ends will always be maintained. Thus,
such inherently non-symmetrical sticky ends, as
well as non-symmetrical variants of recognition
sequences for which some forms can have dyad
symmetry, as described above (e.g., BstXI), are
also useful for practicing the automatic 
directional cloning method of the present 
invention, for efficient ligation of DNA 
fragments in a predirected order. Screenings of 
cDNA libraries constructed by this method, as 
described below, demonstrated that cDNAs of up to 
6.4 kilobase pairs containing complete coding 
sequences could be isolated at high efficiency. 
Thus, this cloning system is particularly useful
for the isolation of cDNAs of relatively long
transcripts present even at low abundance in
cells. 
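
The statement that a three-base sticky end can never be self-complementary can be verified by enumeration. The sketch below assumes only standard Watson-Crick pairing; the comparison with four-base extensions is included to illustrate why an even-length extension (assumed here to be the case for BstXI) can have symmetrical forms, and the function names are illustrative.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def self_complementary(length: int) -> list:
    """All sequences of the given length equal to their own reverse complement."""
    candidates = ("".join(bases) for bases in product("ACGT", repeat=length))
    return [seq for seq in candidates if seq == revcomp(seq)]


three_base = self_complementary(3)  # extension length left by SfiI
four_base = self_complementary(4)   # an even-length extension

print("self-complementary 3-base extensions:", len(three_base))
print("self-complementary 4-base extensions:", len(four_base))
```

For an odd-length extension the middle base would have to pair with itself, which no base does, so the three-base list is empty; for four bases, the first two bases fix the last two, giving sixteen symmetrical forms out of 256.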

The present invention further relates to 
DNA of a genetic cloning vector comprising at
least one replicon; and a site for inserting DNA
segments to be cloned that includes at least two
nonidentical restriction enzyme recognition
sequences that are non-symmetrical, in which the
non-symmetrical restriction enzyme recognition
sequences have been cleaved by one or more
enzymes that can cleave them, so that the DNA is 
ready for use in cloning DNA segments having 
matching sticky ends. 

In a specific embodiment, exemplified 
below, the present invention relates to a cloning 
vector suitable for use in cloning DNA segments 
for cDNAs, by means of stable phenotypic changes 
induced by specific DNA segments. An example
of such vectors is λpCEV27 which differs
(described above) as follows: 

(1) The M-MLV LTR fragment is replaced 
by one derived from pZIPneoSV(X) ; one skilled in 
the art will appreciate, however, that similar 
fragments can be used. 

(2) The bona fide promoter of the neo 
gene is removed to fuse the SV40 promoter
directly to the neo structural gene. This
modification eliminates ATG codons upstream from
the translation initiation site of the neo gene, 
thereby increasing expression of the neo gene in 
mammalian cells. One skilled in the art will 
appreciate that the neo gene can still be 
expressed from the trp-lac fused promoter in 
bacteria (i.e., E. coli).

(3) The second selectable marker in 
bacterial cells, the ampicillin resistance gene
(amp), is introduced to permit selection of
transformed bacterial (i.e., E. coli) cells
resistant to both ampicillin and kanamycin, thus
avoiding selection of truncated plasmid clones.
One skilled in the
art will appreciate that alternative markers can 
be used. 

(4) The sites for two additional 
infrequent cutters, XhoI and MluI, were included
along with the NotI site. Alternative infrequent
cutters can also be used for the purpose of
efficient plasmid rescue.

(5) The multiple cloning site (MCS) 
contains the restriction sites for BamHI, SalI,
SfiI(A), EcoRI, Bgfl, HindIII, SfiI(B), SalI, and
BstEII, in more convenient order.

(6) The SP6-P and T7-P phage promoters 
were introduced to synthesize sense and anti- 
sense RNA of cDNA inserts, respectively.
Alternative promoters can also be used.

(7) A phage origin was introduced to 
synthesize single-stranded DNA from the vector;
in the case of λpCEV27, f1 is used.

(8) The rat preproinsulin gene 
polyadenylation signal is added for efficient 
expression of DNA inserts (alternative signals 
can be used to effect the desired end result) . 

(9) The replication origin of pUC19 is 
used to increase the copy number of pCEV27. The 
replication origin of pCEV9 and pCEV15 is derived
from a short fragment of pBR322; since the ori 
sequence lacks the promoter for replication 
initiation, unstable replication results and thus 
lower copy number. Replication origins similar 
to pUC19 can also be used to increase copy 
number. 

Finally, the present invention also 
relates to a reagent kit comprising cleaved
vector DNA, ready for use in cloning, as
described above, and further including a linker-
primer having a single-stranded end that is
complementary to one single-stranded end of the
cleaved vector DNA; and an adaptor which, after
cleavage by a suitable restriction enzyme, has a
single-stranded end that is complementary to the
other single-stranded end of the cleaved vector
DNA. One skilled in the art of genetic
engineering would appreciate that such a kit 
might advantageously also include appropriate 
quantities of enzymes, buffers and other reagents 
needed for the practice of the automatic 
directional cloning method according to the 
teachings of the present invention. 

The present invention may be understood 
more readily by reference to the following 
detailed description of specific embodiments and 
the Examples and Figures included therein. 

BRIEF DESCRIPTION OF THE DRAWINGS

Fig. 1. Structures of the vectors.
Panel (A), λpCEV9 and panel (B), λpCEV15. Each
vector contains a plasmid DNA within the λ DNA.
An expanded map of the plasmid portion is shown
with derivation of the DNA segments and the
location of several restriction sites including
the multiple cloning site (MCS). Arrows show the
locations of the promoters and the direction of 
transcription. 

Fig. 2. Scheme of the automatic
directional cloning system. Panel (A),
nucleotide sequences of the SfiI sites. The
general structure of the SfiI site is shown at
the top, where the letter N denotes any
nucleotide. The two SfiI sites specific to the
vectors, SfiI(A) and SfiI(B), are shown under the
general structure. The upper strands are shown
in the 5' to 3' direction, while the lower ones
are in the opposite direction. The bottom of the
figure shows the sequences of the ends produced
by SfiI cleavage of the general SfiI site.
The left and right half sites are denoted as
SfiI(+) and SfiI(-), respectively. Similarly,
the sequences of the SfiI(A+), SfiI(A-), SfiI(B+),
and SfiI(B-) half sites can be derived from the
sequences of the SfiI(A) and SfiI(B) sites.
Panel (B), preparation of the λpCEV vector arms.
λpCEV vector DNA is shown at the top, where cosL
and cosR represent the left and right cohesive
ends of λ, respectively. The locations of the
SfiI(A) and SfiI(B) sites are shown as (A) and
(B), respectively. Following ligation to seal
the cohesive ends (in the middle), the DNA is
cleaved by SfiI to expose the SfiI(A+) and
SfiI(B-) sticky ends of the vector molecules.
The small stuffer fragment is removed to prepare
the vector arms shown at the bottom. The
sequences of the single-stranded extensions are
shown at the 3'-ends of both the cDNA and vector
arms. Panel (C), formation of the λ concatemers
containing cDNA inserts. The cDNA fragments shown
at the left side are prepared so as to carry the
SfiI(A-) and SfiI(B+) sticky ends
by the procedure described in Fig. 3. When the
fragments are ligated with the prepared vector
arms, alternating concatemers consisting of the
cDNA inserts and vector arms in the defined
orientation are produced automatically due to the
base-pairing specificity, as shown in the middle.
In vitro packaging extracts cut out the DNA
segments flanked by the two cos sites from the
concatemer to form the active λ phage particles,
as shown at the bottom.
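
The half-site derivation described for Panel (A) can be sketched in Python. This is a minimal illustration, not part of the disclosed method: it assumes the standard SfiI cleavage pattern GGCCNNNN^NGGCC (a 3-nucleotide 3' extension on each half), and it takes the SfiI(A) and SfiI(B) site sequences from the synthetic oligonucleotides #1 and #3 listed in Example 1. The function names are hypothetical.

```python
# Sketch: derive the 3' single-stranded extensions left by SfiI cleavage.
# Assumption: SfiI cuts GGCCNNNN^NGGCC on each strand, leaving 3-nt 3' overhangs.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (written 5'->3')."""
    return seq.translate(COMPLEMENT)[::-1]


def sfi_half_sites(site: str) -> tuple[str, str]:
    """Return the (+) and (-) 3' extensions of a 13-nt SfiI site, each 5'->3'."""
    if len(site) != 13 or not (site.startswith("GGCC") and site.endswith("GGCC")):
        raise ValueError("expected a 13-nt GGCCNNNNNGGCC site")
    plus = site[5:8]        # upper-strand overhang on the left half
    minus = revcomp(plus)   # lower-strand overhang on the right half
    return plus, minus


def can_pair(end_a: str, end_b: str) -> bool:
    """Two 3' extensions anneal only if they are reverse complements."""
    return end_a == revcomp(end_b)


SFI_A = "GGCCATTATGGCC"  # site found within oligonucleotide #1
SFI_B = "GGCCGCCTCGGCC"  # site found within oligonucleotide #3

a_plus, a_minus = sfi_half_sites(SFI_A)
b_plus, b_minus = sfi_half_sites(SFI_B)
print(a_plus, a_minus, b_plus, b_minus)
```

Under these assumptions, `can_pair(a_plus, a_minus)` is true, while `can_pair(a_plus, a_plus)` and `can_pair(a_plus, b_minus)` are false. The extensions are non-palindromic and differ between the two sites, which is the property that prevents self-ligation and fixes the insert orientation.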

Fig. 3. Schematic view of the cDNA
synthesis. An mRNA molecule is shown at the top
with the cap structure (m7Gppp) and the poly(A)
stretch (AAAAA) at the 5'- and 3'-ends,
respectively. The linker-primer is a
single-stranded oligonucleotide which contains
the oligo(dT) at the 3' half and the SfiI(B)
site (shown by an asterisk) at the 5' half. The
first strand is synthesized by Moloney murine
leukemia virus reverse transcriptase (M-MLV RT)
from the linker-primer hybridized with the RNA
molecule. The second strand is synthesized by
DNA polymerase I from the nicks on the RNA moiety
introduced by RNase H. The linker-primer is
converted to the double-stranded form in the
first or second strand synthesis step. T4 DNA
polymerase treatment makes double-stranded any
single-stranded extensions remaining on the
synthesized cDNA molecule. The SfiI adaptor
ligation adds one 3' single-stranded extension,
the SfiI(A-) sticky end, to the cDNA molecule.
Another 3' extension, the SfiI(B+) sticky end, is
exposed by SfiI cleavage of the SfiI(B) site on
the repaired linker-primer portion of the cDNA
molecule.

Fig. 4. Cloning of a model insert into
pCEV15 using the ADC method. Panel (A),
restriction map of pCEV15-RAS. The plasmid was
constructed by cloning a 0.7 kbp fragment
containing the mouse H-ras (v-bas) coding
sequence (Reddy et al., 1985) into the EcoRI site
of pCEV15. The open thick arc and closed thin
arc represent the H-ras insert and vector,
respectively. Panel (B), analysis of ligation
products. pCEV15-RAS DNA was digested with SfiI,
and vector (4.2 kb) as well as insert (0.7 kb)
fragments were purified from the gel. Similarly,
EcoRI/ApaI fragments were prepared as controls.
The vector and/or H-ras insert SfiI fragments
(left half) or EcoRI/ApaI fragments (right half)
were incubated in kinase ligase buffer (see
below) with or without T4 DNA ligase as
indicated. The ligation products were analyzed
by agarose gel electrophoresis. Sizes of the
fragments are shown in kb. Panel (C), HindIII
digestion of pCEV15 and its derivatives: (lane
a) pCEV15; (lane b) pCEV15-RAS; (lane c) pCEV15
containing the H-ras insert in opposite
orientation; (lane d) a marker (1-kb ladder;
Bethesda Research Laboratories); and (other
lanes) plasmid DNAs isolated from 20 individual
kanamycin-resistant colonies obtained by
transformation of DH5α with the vector ligated to
H-ras insert SfiI fragments.

Fig. 5. PDGF receptor clones isolated
from the λpCEV9-M426 cDNA library. Panels (A)
and (B), cDNA clones encoding the β and α PDGF
receptors, respectively. The structure of each
PDGF receptor cDNA is schematically shown with
restriction sites. Open boxes represent coding
sequences, while non-coding sequences are shown
by bars. The clones shown by thick lines were
isolated from the M426 cDNA library. The thin
lines represent clones isolated from other
libraries as described (Matsui, T., et al., 1989,
Science 243, 800-804): HB15, HB3, and HB6 were
derived from the human brain stem cell cDNA
library in λgt11 (provided by R. Lazzarini;
Matsui et al., 1989, supra); HF1 from the
Okayama-Berg human fibroblast cDNA library
(Okayama and Berg, 1982); and EF17 from a
randomly-primed M426 cDNA library in λgt11
(Matsui et al., 1989, supra). Panels (C) and
(D), nucleotide sequences of 5'-untranslated
regions of β and α PDGF receptor clones HPR5 and
TR4, respectively. Sequencing was performed by
the chain termination method (Sanger et al.,
1977, Proc. Natl. Acad. Sci. USA 74, 5463-5467).
The initiation codons are underlined.

Figure 6. Structure of the cDNA
cloning-expression vector λpCEV27.

Structure of the λpCEV27 genome is shown
at the upper half with the location of λ genes.
The plasmid part is enlarged and shown at the
lower half as a circular map. The multiple
excision site (MES) contains the restriction
sites for infrequent cutters: NotI, XhoI, PvuI,
and MluI. The multiple cloning site (MCS)
contains the restriction sites for BamHI, SalI,
SfiI(A), EcoRI, BglII, HindIII, SfiI(B), SalI,
and BstEII, and was placed in the clockwise
orientation. The two SfiI sites are used to
insert cDNA molecules by the automatic
directional cloning method (Miki, T., et al.
(1989) Gene 83, 137-146), and the two SalI sites
are used to release the inserts. SP6-P and T7-P
represent the phage promoters for SP6 and T7 RNA
polymerases, respectively. The trp-lac fused
promoter tac and SV40 early promoter are used to
express the neo structural gene in E. coli
(kanamycin resistance) and eukaryotic cells
(G-418 resistance), respectively. The directions
of transcription from the promoters are shown by
the arrows. Polyadenylation signals are labeled
as polyA. The locations of the replication
origins (ori) and the ampicillin resistance gene
(amp) are shown.

Figure 7. Strategies for expression cloning of
transforming gene cDNAs.

Cells of NIH 3T3 are transfected by
λpCEV27 cDNA library DNA. Transformed cells are
isolated from induced foci and assayed for G-418
resistance and colony formation in soft agar.
Cells are expanded and the genomic DNA isolated.
The DNA is digested by either NotI, XhoI or MluI,
and then ligated at a low concentration. A
bacterial strain is transformed by the ligated
DNA and colonies resistant to both ampicillin
and kanamycin are isolated. Plasmid DNA is
extracted from each colony and used to transfect
NIH 3T3 cells to examine focus formation. Since
cDNA library-induced foci are presumed to contain
multiple cDNA clones, transforming plasmids are
identified in this transfection assay.

Figure 8. Detection of hepl cDNA inserts in the
ST18-2 cDNA library-induced transformants.

The genomic DNA from individual
transformants (from CT18-1A to CT18-1G) was
digested by SalI, which can release the cDNA
inserts (see Figure 6). The digested DNA (5 μg)
was separated on a 0.5% agarose gel by
electrophoresis and transferred to a supported
nitrocellulose membrane (Nitrocellulose GTG, FMC
BioProducts). The Southern blot was probed by
the hepl cDNA insert of pHEPl-B which was rescued
from the transformant T18-B. The NIH 3T3 genomic
DNA was used as a negative control. The location
of each fragment of the molecular size marker (1
kb ladder, BRL) is shown at the right side in kb.

Figure 9. Sequence homology of the hepl and
B-raf gene cDNAs.

A restriction map of the cDNA insert of
pHEPl-B is schematically shown at the top (a).
The regions where nucleotide sequence was
determined are shown by arrows and labeled A
and B. The sequences are shown below (b).
Computer analysis was performed with
IntelliGenetics programs. The B-raf sequence was
taken from and numbered as in Ikawa et al.
(1988).

Figure 10. Rearrangement and amplification of
the hepl locus in the primary and secondary
transformants.

The sources of DNA and restriction
enzymes used are shown at the top. The strains
PT-1 and PT-2 are the primary transformants
induced by the original tumor DNA. The strain
18-1 was a secondary transformant induced by PT-2
DNA and is the source of the cDNA library. The
genomic DNA (5 μg) was digested by SalI to
release the cDNA inserts, separated on a 0.5%
agarose gel by electrophoresis, and transferred
to a supported nitrocellulose membrane. The
Southern blot was probed by the hepl cDNA insert
of pHEPl-B. The NIH 3T3 genomic DNA was used as
a negative control. The locations of several
fragments of the molecular size markers (1 kb
ladder and high molecular weight DNA marker, BRL)
are shown at the right side in kb.

Figure 11. Detection of mRNAs for the bral and
B-raf genes.

The poly(A)+ RNA extracted from the cells
indicated was denatured and separated on a
formaldehyde gel. RNA was transferred to a
supported nitrocellulose membrane and probed by
each probe. The 5' probe was isolated as the
SalI-HindIII fragment (see Figure 8). The 3'
probe was prepared by polymerase chain reaction
(PCR) using the GeneAmp kit (Cetus Co.) from CT18-2B
genomic DNA. The B-raf primer
(5'-CCTCGAGATTCAAGTGATGAC-3') and the T7 primer
(5'-CTAATACGACTCACTATAGGGG-3') were used for PCR
and the amplified fragment was purified from an
agarose gel.

FIG. 12 Cell morphology of control
NIH/3T3 and transformants induced by keratinocyte
cDNA expression library.

NIH/3T3 cells (A) and NIH/3T3 cells
transfected with the ect1 (B), ect2 (C), or ect3
(D) at 21 days post-transfection. Cells were
maintained in Dulbecco's modified Eagle's medium
(DMEM) containing 5% calf serum. (x 180)

FIG. 13 Specific binding of [125I]-KGF to
BALB/MK, NIH/3T3, and NIH/ect1, NIH/ect2, and
NIH/ect3 transformants.

Methods: Recombinant KGF was radiolabeled with
[125I]-Na by the chloramine-T method as described
previously. Confluent cultures in 24-well plates
were serum-starved for 24 h, followed by
incubation with HEPES binding buffer (HBB; 100 mM
HEPES, 150 mM NaCl, 5 mM KCl, 1.2 mM MgSO4, 8.8 mM
dextrose, 2 mg/ml heparin, and 0.1% BSA, pH 7.4)
containing [125I]-KGF for 1 h at 22°C. The cells
were then washed with cold PBS, lysed with 0.5%
SDS, and cell-associated radioactivity was
measured in a gamma counter. Bound cpm were
normalized to the cell protein content of SDS
extracts. Specific binding was determined by
subtracting normalized cpm of samples incubated
with 100-fold excess unlabeled KGF from the
normalized cpm bound in the presence of [125I]-KGF
alone.
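
The specific-binding calculation described in these Methods is a subtraction of normalized counts. A minimal Python sketch follows; the function names and the example numbers are hypothetical and are not data from this disclosure.

```python
# Sketch of the specific-binding arithmetic described in the Methods.
# All numeric inputs below are made-up illustration values.


def normalize_cpm(bound_cpm: float, protein_mg: float) -> float:
    """Normalize bound counts to the protein content of the SDS extract."""
    if protein_mg <= 0:
        raise ValueError("protein content must be positive")
    return bound_cpm / protein_mg


def specific_binding(total_norm: float, nonspecific_norm: float) -> float:
    """Total binding minus binding measured with 100-fold excess unlabeled ligand."""
    return total_norm - nonspecific_norm


total = normalize_cpm(bound_cpm=12000.0, protein_mg=0.5)       # labeled KGF alone
nonspecific = normalize_cpm(bound_cpm=1500.0, protein_mg=0.5)  # plus excess unlabeled KGF
print(specific_binding(total, nonspecific))
```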

FIG. 14 DNA and RNA analysis of the
ect1 sequence. a, Southern analysis of the
SalI-digested DNAs from NIH/3T3 and its
transformants. The blot was probed with the
entire ect1 cDNA insert. Since SalI is an
infrequent cutter of mammalian DNA, most of the
DNA fragments are extremely large and migrate
near the origin of the gel. However, the cDNA
inserts released by SalI from the vector are
shorter and migrate into the gel, allowing the
detection of the insert without detection of the
endogenous ect1 gene.

b, Southern analysis of EcoRI-digested
DNAs of different animal species (Clontech, Palo
Alto, CA). The blot was probed with the 5'-half
of the ect1 cDNA insert and washed under reduced
stringency conditions.

c, Northern analysis of NIH/3T3 and
BALB/MK RNA. The blot was probed with the
5'-half of the ect1 cDNA (lanes 1 and 2) or
β-actin cDNA (lanes 3 and 4) and washed under
stringent conditions.

Methods: For plasmid rescue, genomic DNA was
cleaved by one of the infrequent cutters which
can release the plasmids containing cDNA inserts.
Digested DNA was ligated under diluted conditions
and used to transform bacterial competent cells.
Plasmids were isolated from ampicillin- and
kanamycin-resistant transformants and used to
transfect NIH/3T3 cells to examine for focus
formation. The ect1 plasmid was rescued by XhoI,
while the ect2 and ect3 plasmids were rescued by
NotI digestion. For Southern analysis, DNA (10
ng) was digested by SalI (Panel a) or EcoRI
(Panel b), fractionated by agarose gel
electrophoresis, and transferred to a
nylon-supported nitrocellulose paper
(Nitrocellulose-GTG, FMC, Rockland, ME). The
blot in Panel a was hybridized with the
32P-labeled entire ect1 insert at 42°C and washed
at 65°C in 0.1 x SSC, while the blot in Panel b
was hybridized with the 32P-labeled 5'-ect1 probe
(see Fig. 15b) at 37°C and washed at 55°C in
0.1 x SSC. Location of DNA molecular weight
markers (BRL, Gaithersburg, MD) is indicated in
kb. For Northern analysis (Panel c), poly(A)+ RNA
(5 ng each) was fractionated by formaldehyde gel,
transferred to Nitrocellulose GTG, and hybridized
with the 5' ect1 probe (lanes 1 and 2). After
autoradiography, the filter was boiled to remove
the probe and then hybridized with a β-actin
probe (Gunning et al., Molec. Cell Biol. 3, 787-
795 (1983)) (lanes 3 and 4). Location of
molecular weight markers (BRL, Gaithersburg, MD)
is indicated in kb.

All hybridization experiments were
performed at the indicated temperature in a
solution containing 50% formamide, 5 x SSC, 2.5 x
Denhardt's solution, 7 mM Tris-HCl (pH 7.5), 0.1
mg/ml of denatured calf thymus DNA, and 0.1 mg/ml
of tRNA.

FIG. 15 Nucleotide sequence of the KGF receptor
cDNA.

a, Nucleotide sequence and deduced
amino acid sequence of the coding region of the
KGF receptor cDNA. Nucleotides are numbered
from the 5'-end of the cDNA. Initiation and
termination codons are underlined. Amino acids
are numbered from the putative initiation site of
translation and shown above the amino acid
sequence. Potential sites of N-linked
glycosylation are indicated by dots above the
residues. The potential signal peptide and
transmembrane domains are underlined. The
interkinase domain is shown by underlined italic
letters. Glycine residues considered to be
involved in ATP binding are indicated by
asterisks. Cysteine residues delimiting two
immunoglobulin-like domains in the extracellular
portion of the molecule are shown by : over the
residues. Nucleotide sequence was determined by
the chain termination method (Sanger et al., PNAS
74, 5463-5467 (1977)).

b, Structural comparison of the
predicted KGF and bFGF receptors. The region
used as a probe for Southern and Northern
analysis (Fig. 14b and c) is indicated. The
region homologous to the published bek sequence 31
is also shown. The schematic structure of the
KGF receptor is shown below the restriction map
of the cDNA clone. Amino acid sequence
similarities with the smaller and larger bFGF
receptor variants are indicated. S, signal
peptide; IG1, IG2, and IG3, immunoglobulin-like
domains; A, acidic region; TM, transmembrane
domain; JM, juxtamembrane domain; TK1 and TK2,
tyrosine kinase domains; IK, interkinase domain;
C, C-terminus domain. Amino acid sequence
comparison was performed using the method of
Pearson and Lipman (Pearson et al., PNAS 85, 2444-
2448 (1988)).

FIG. 16 Competition of KGF, aFGF, and bFGF for
[125I]-KGF binding on BALB/MK cells (A) and
NIH/ect1 cells (B).

Methods: Binding assays were performed as
described for Fig. 14, except that cells were
incubated with [125I]-KGF in the presence of
unlabeled KGF, aFGF or bFGF at concentrations
indicated on the x-axis. For Scatchard analysis,
samples contained several concentrations of
[125I]-KGF (1-100 ng/ml) in the presence or absence
of a 100-fold excess unlabeled KGF, and were also
processed as outlined in Fig. 13. Estimates of
receptor affinity and total binding capacity were
made using LIGAND software (Munson et al., Anal.
Biochem. 107, 220-239 (1980)).
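
For orientation, the Scatchard transformation mentioned above can be sketched as a straight-line fit of bound/free against bound, whose slope is -1/Kd and whose x-intercept is the total binding capacity (Bmax). This is only an illustrative unweighted linear fit on synthetic one-site data. It is not the fitting procedure of the LIGAND program, and all names and values are hypothetical.

```python
# Sketch: linear Scatchard analysis for a single class of binding sites.
# Not the LIGAND algorithm; synthetic data are generated from assumed Kd/Bmax.


def scatchard_fit(bound: list[float], free: list[float]) -> tuple[float, float]:
    """Fit bound/free = (Bmax - bound) / Kd by least squares; return (Kd, Bmax)."""
    x = bound
    y = [b / f for b, f in zip(bound, free)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    kd = -1.0 / slope          # slope of the Scatchard line is -1/Kd
    bmax = intercept * kd      # y-intercept is Bmax/Kd
    return kd, bmax


# Synthetic one-site binding curve (assumed values, arbitrary units).
TRUE_KD, TRUE_BMAX = 0.4, 50.0
free_conc = [0.05, 0.1, 0.2, 0.5, 1.0, 2.0, 5.0]
bound_conc = [TRUE_BMAX * f / (TRUE_KD + f) for f in free_conc]

kd_est, bmax_est = scatchard_fit(bound_conc, free_conc)
print(round(kd_est, 3), round(bmax_est, 3))
```

With noise-free one-site data the fit recovers the assumed Kd and Bmax; real binding data would need nonspecific binding subtracted first and are usually better handled by weighted non-linear fitting.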

FIG. 17 a, Covalent affinity crosslinking of
[125I]-KGF to BALB/MK (left), NIH/3T3 (center), and
NIH/ect1 cultures (right). The left and center
panels of this autoradiogram were exposed to
Kodak XAR film for 72 h at -70°C; the right panel
is an 18 h exposure of the same autoradiogram.
The second lane for each cell type shows
crosslinking performed in the presence of excess
unlabeled KGF. Molecular weight markers are
indicated on the left; the positions of
[125I]-KGF-crosslinked complexes are indicated by
arrows.

b, Autoradiogram of
phosphotyrosyl-proteins from intact NIH/3T3
(left) and NIH/ect1 cells (right) before and
after treatment with KGF. Molecular weight
markers are indicated on the left; the estimated
molecular weights of proteins displaying
KGF-stimulated phosphorylation on tyrosine are
shown at right.

Methods: Samples for covalent crosslinking were
prepared from confluent, serum-starved cultures
in 6 cm dishes, using 10 ng/ml [125I]-KGF in the
presence or absence of 30-fold excess KGF. After
binding (as described for Fig. 17), crosslinking
with disuccinimidyl suberate was performed as
described previously. The cells were then
scraped into cold HBB containing 0.1 mM aprotinin
and 1.0 mM phenylmethylsulfonylfluoride, and a
crude membrane fraction was generated by brief
sonication (50 W, 10 sec), low-speed
centrifugation (600 x g, 10 min), and high-speed
centrifugation (100,000 x g, 30 min) of the
low-speed supernatant. The membrane pellet was
solubilized in Laemmli sample buffer (Laemmli,
Nature 227, 680-685 (1970)), containing 100 mM
dithiothreitol, and boiled for 3 min.
[125I]-labeled proteins were resolved by 7.5%
SDS-PAGE and autoradiography of the dried gel.
Analysis of phosphoproteins was performed as
follows. Confluent cultures in 10 cm dishes were
serum-starved for 24 h, then treated with (+) or
without (-) KGF (30 ng/ml) for 10 min at 37°C.
The medium was aspirated, and the cells were
solubilized in cold HEPES buffer containing 1%
Triton X-100, protease- and
phosphatase-inhibitors as described previously.
The lysate was cleared by centrifugation, and
phosphotyrosyl proteins were immunoprecipitated
with affinity-purified anti-Ptyr adsorbed to
GammaBind G-agarose (Genex, Gaithersburg, MD).
Phosphotyrosyl proteins were specifically eluted
using 50 mM phenyl phosphate, diluted in Laemmli
sample buffer, and resolved by 7.5% SDS-PAGE.
Proteins were then transferred to nitrocellulose
and detected with anti-Ptyr and [125I]-protein-A as
described previously.

DESCRIPTION OF SPECIFIC EMBODIMENTS 
In one aspect, the present invention
relates to a vector having nonidentical
non-symmetrical restriction enzyme recognition site
sequences, as described above, also including
regulatory elements located such that the
sequences of an inserted DNA segment are
transcribed, where the regulatory elements are at
least partly of eukaryotic origin. A principal
embodiment of this aspect of the present
invention is exemplified by two λ-plasmid
composite vectors, λpCEV15 and λpCEV9, the
structures of which are depicted in Figure 1 and
described further below, in Example 1.

To examine the performance of the ADC 
method, a model H-ras insert was prepared so as 
to have SfiI(A-) and SfiI(B+) ends (Fig. 4A) and
ligated with the pCEV15 SfiI fragment. To show
the difference between the ADC method and the
"forced" cloning method using two different
restriction enzymes, a similar H-ras fragment
with EcoRI and ApaI ends was prepared (Fig. 4A)
and ligated with the pCEV15 EcoRI/ApaI fragment.
To measure the efficiency of cDNA cloning using a
natural template, 2.5 μg of a poly(A)+ RNA
preparation was denatured by heating and used to 
synthesize cDNA from a linker-primer. The 
results of all these experiments, described in 
Example 2, below, illustrate the remarkable 
efficiency of cloning of model inserts using this 
novel method of the present invention. 

To assess the performance of the cDNA 
cloning method, cDNA was synthesized using 
poly(A)+ RNA extracted from M426 human embryonic
lung fibroblast cells under the conditions
described in Example 2, below. cDNA molecules
larger than 1 kb were selected by low melting
point agarose gel electrophoresis, and two
aliquots were used to clone into λpCEV9 and
λpCEV15. The average size of the cDNA inserts
was 2.0 kb in the λpCEV9 library (6 x 10*
independent clones) and 2.2 kb in the λpCEV15
library (1 x 10^7 independent clones).

To characterize the M426 cDNA library in
λpCEV9 further, it was screened for the human β
PDGF receptor cDNA. Before this cDNA library was
constructed, clones isolated (HB3 and HB15) by
screening several other libraries did not contain
the entire coding sequence (Fig. 5A). When a
part (9 x 10^5 pfu) of the M426 cDNA library was
screened for the human β PDGF receptor, six
clones were isolated. Of these, three contained
inserts of approximately 5 kb. Sequence analysis
showed that two (HPR2 and HPR5) contained the
entire coding sequence (Fig. 5A). Recently,
Matsui et al. (1989, supra) have identified the
cDNA of a novel PDGF receptor, designated the α
PDGF receptor, by isolation of overlapping cDNA
clones (HF1, HB6 and EF17 in Fig. 5B).
Re-probing of filters from the M426 cDNA library for
the human α PDGF receptor resulted in the
isolation of 93 clones. Of 7 clones analyzed, 5
including TR4 contained inserts of 6.4 kb, which
corresponded to the size of the message (Fig.
5B). As shown in Figs. 5C and 5D, sequence
analysis of α and β receptor cDNAs isolated from
the M426 library revealed 5'-untranslated
sequences followed by initiation codons for the
complete coding sequence of each gene. These
results indicated that the cDNA cloning system
described here is suitable for isolation of
relatively long cDNAs.
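
The screening yields reported above translate into clone frequencies by simple division. The short Python sketch below uses the counts given in the text. It assumes, as the text suggests but does not state outright, that the re-probed filters carried the same 9 x 10^5 pfu; the function name is hypothetical.

```python
# Sketch: frequency of positive clones in the M426 library screens.
# Counts are taken from the text; the pfu figure for the second screen is an assumption.


def clones_per_million(positives: int, pfu_screened: float) -> float:
    """Positive clones per 10^6 plaque-forming units screened."""
    return positives / pfu_screened * 1e6


PFU_SCREENED = 9e5

first_screen = clones_per_million(6, PFU_SCREENED)    # 6 clones in the first screen
second_screen = clones_per_million(93, PFU_SCREENED)  # 93 clones on re-probing
long_insert_fraction = 5 / 7                          # 6.4-kb inserts among 7 clones analyzed

print(f"{first_screen:.1f} {second_screen:.1f} {long_insert_fraction:.2f}")
```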

This method has been used for more than
one year in the laboratory of these inventors,
without public disclosure, to construct several
cDNA libraries. Screening of the libraries for
growth factors and receptors has been performed,
and in most cases cDNA clones containing the
entire coding sequence have been obtained as
single clones. For example, a number of cDNA
clones encoding keratinocyte growth factor have
been isolated from the λpCEV9-M426 cDNA library
using oligonucleotide probes. Screening of an
MCF7 cDNA library in λpCEV9 constructed by the
ADC method for a novel erbB-related gene resulted
in the isolation of cDNA clones of 5 kb with
high frequency as well. All of these findings
indicate that the ADC method using λpCEV vectors
makes it possible to clone relatively long cDNAs
very efficiently.

From the present data, the following 
conclusions may be drawn: 

(1) The ADC procedure resulted in very
high cloning efficiency (10^7-10^8 clones/μg of
mRNA).

(2) Usually backgrounds of libraries 
constructed by this method are very low. When 
the vector arms are prepared carefully, almost 
all of the clones contain cDNA inserts. 

(3) cDNAs are inserted into the vectors 
in a directed orientation and as single inserts, 
making analysis of cDNA inserts simple and 
straightforward. 

(4) The vectors can accommodate longer
inserts than other λ vectors without sacrificing
cloning efficiency, making it possible to clone
relatively long cDNA fragments.

(5) Plasmids carrying cDNA inserts can
be released from λ genomes by NotI cleavage.
This feature facilitates the structural analysis
of cDNA clones, permits the generation of
size-selected plasmid sub-libraries, and makes it
possible to recover the cDNA clones by plasmid
rescue from eukaryotic cells.

The inventors have further refined the
vectors to allow high levels of cDNA expression
in mammalian cells and the ability to perform
plasmid rescue. Example 3 tests the potential of
the improved approach by its application to the
isolation and characterization of unknown
oncogenes from hepatocellular carcinomas of the
B6C3F1 mouse strain, extensively utilized in long
term carcinogenesis testing in the United States.

Example 4 discloses the utility of the
refined directional cDNA library expression
vector for isolating the keratinocyte growth
factor (KGF) receptor cDNA by creation of an
autocrine transforming loop. This expression
cloning approach succeeded in identifying and
functionally cloning the receptor for the new
growth factor.

Example 1. Construction of λpCEV15 and λpCEV9.

The following materials and methods were
used in this and the subsequent examples, as
needed.

Restriction enzymes, DNA polymerases, T4
DNA ligase, and T4 polynucleotide kinase were
purchased from New England BioLabs, Bethesda
Research Laboratories, and Boehringer Mannheim.
M-MLV reverse transcriptase and RNase H were from
Bethesda Research Laboratories. Bacterial
strain LE392 [F-, hsdR511 (rk- mk-) supE44 supF58
lacY1 or Δ(lacIZY)6 galK2 galT22 metB1 trpR55] was
used as a host for λ. DH5α (Bethesda Research
Laboratories) was used for bacterial
transformation. NZY broth (10 g NZ amine, 5 g
NaCl, 5 g yeast extract in 1 l, pH 7.5) was used
to grow bacterial strains. M426 is a human lung
embryonic fibroblast cell line (Aaronson and
Todaro, 1968, J. Virology 36, 254-261).
Oligonucleotides were synthesized by a Beckman
System 1 DNA Synthesizer and purified by high
performance liquid chromatography.
Oligonucleotides utilized had the following
sequences:

#1: GATCCGTCGACGGCCATTATGGCCAGAATTCTGGGCCCG,
#2: TCGACGGGCCCAGAATTCTGGCCATAATGGCCGTCGACG,
#3: AATTCAGGCCGCCTCGGCCAAGCTTAGATCTGGGCCCG,
#4: TCGACGGGCCCAGATCTAAGCTTGGCCGAGGCGGCCTG,
#5: TGGATGGATGG,
#6: CCATCCATCCATAA,
#7 and #8: GGACAGGCCGAGGCGGCC(T)n, where n=20 or
40 in the case of #7 and #8, respectively.
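
As a quick consistency check on the sequences listed above, the recognition sites carried by the linker oligonucleotides #1 and #3 can be located programmatically. The sketch below is illustrative only: the recognition sequences are the standard ones for each enzyme, the SfiI site is written as a regular expression for GGCCNNNNNGGCC, and the helper name is hypothetical.

```python
# Sketch: locate restriction sites in the MCS linker oligonucleotides #1 and #3.
import re

SITES = {
    "SalI": "GTCGAC",
    "SfiI": "GGCC[ACGT]{5}GGCC",
    "EcoRI": "GAATTC",
    "ApaI": "GGGCCC",
    "HindIII": "AAGCTT",
    "BglII": "AGATCT",
}

OLIGO_1 = "GATCCGTCGACGGCCATTATGGCCAGAATTCTGGGCCCG"
OLIGO_3 = "AATTCAGGCCGCCTCGGCCAAGCTTAGATCTGGGCCCG"


def map_sites(seq: str) -> list[tuple[int, str]]:
    """Return (0-based start, enzyme) for each site found, in sequence order."""
    hits = []
    for enzyme, pattern in SITES.items():
        for match in re.finditer(pattern, seq):
            hits.append((match.start(), enzyme))
    return sorted(hits)


print([name for _, name in map_sites(OLIGO_1)])
print([name for _, name in map_sites(OLIGO_3)])
```

The 5'-terminal GATC of #1 and AATT of #3 are not complete sites in themselves; they read as the single-stranded ends that would join BamHI- and EcoRI-cut plasmid DNA, which is an interpretation of the sequences rather than a statement in the text.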

Plasmid DNA was prepared by the 
"selective precipitation procedure" which is a 
modification of the alkaline lysis method 
(Birnboim and Doly, 1979, Nucleic Acids Res. 7, 
1513-1523). This technique makes it possible to 
prepare sufficiently pure plasmid DNAs to analyze
and alter structures, without a requirement for
lysozyme treatment, phenol extraction or repeated
ethanol precipitations. Cells collected from a
10 ml culture were resuspended in 0.2 ml of TEG
(25 mM Tris-HCl pH 8.0/10 mM EDTA/50 mM
glucose). After transfer to a microcentrifuge
tube, 0.2 ml each of 2% sodium dodecyl sulfate
and 0.4 M NaOH was added, mixed, and incubated at
room temperature for 5 min. After the addition
of 0.2 ml of 3 M ammonium acetate (pH 4.8),
incubation at 0°C for 10 min, and centrifugation
for 15 min in a microcentrifuge, the supernatant
was transferred to a fresh tube containing 0.2 ml
of 2 M Tris-HCl (pH 8.9) and 2 μl of 2 mg/ml
RNase A. Following incubation at 37°C for 30 min
and centrifugation, the supernatant was
transferred to a new tube containing 0.6 ml cold
isopropanol. The tube was inverted several
times and incubated at room temperature for 10
min. DNA was collected by centrifugation and
then washed with 75% ethanol. The pellet was
dried by incubation at 37°C for 5 min and
resuspended in 50 μl of 10 mM Tris-HCl (pH
8.0), 1 mM EDTA.

The λ DNAs prepared as follows were used
to modify the structure, analyze cDNA clones in
λpCEV vectors and then to rescue the plasmid
part. Host cells grown in 10 ml of NZY medium
containing 2 mM MgCl2 were suspended in the
same volume of SM buffer (50 mM Tris-HCl, pH
7.5/8 mM MgSO4/100 mM NaCl/0.01% gelatin). A
single plaque picked by a Pasteur pipet was
incubated with 0.1 ml of the host cell suspension
at 37°C for 30 min. Ten ml of pre-warmed NZY
broth containing 2 mM MgCl2 was added and shaken
at 37°C for 6 h. This procedure allows
single-step production of high-titer lysates.
Phage particles were precipitated and DNAs
prepared as described by Arber et al., 1983 [In
Hendrix, R. W., Roberts, J. W., Stahl, F. W. and
Weisberg, R. A. (Eds.), Lambda II. Cold Spring
Harbor Laboratory, Cold Spring Harbor, NY, 1983,
pp. 433-466], with several modifications. A few
drops of chloroform were added to the lysate,
mixed and debris was removed by centrifugation.
After the chloroform remaining in the lysate was
removed by incubation at 37°C, 50 μl each of DNase
I (1 mg/ml) and RNase A (1 mg/ml) were added, and
incubated at 37°C for 1 h. Phage particles were
precipitated by the addition of 5 ml of 30%
PEG/3 M NaCl/10 mM MgCl2 followed by incubation
on ice for 1 h. Phage particles were collected
by centrifugation at 3,000 rpm for 30 min, and
the pellet was resuspended in 0.5 ml of SM. The
suspension was transferred to a fresh
microcentrifuge tube containing 20 μl of
proteinase K (20 mg/ml). After incubation at
37°C for 15 min, 50 μl of 100 mM Tris-HCl (pH 8.0)
/100 mM EDTA/1% sodium dodecylsulfate was added
and the tube was incubated at 65°C for 30 min.
The released DNA was extracted by
phenol/chloroform and then by chloroform. DNA
was precipitated by 0.6 volume of isopropanol
with 0.3 volume of 7.5 M ammonium acetate, washed
with 75% ethanol and dried. The pellet was
dissolved in 10 mM Tris-HCl (pH 8.0), 1 mM EDTA,
0.1 mg/ml of RNase A and incubated at 37°C for 30
min to digest ribosomal RNAs.

Plasmid DNAs prepared by the selective 
precipitation method were directly used to modify 
the structures, by restriction enzyme digestions, 
repair reactions, and/or ligation with synthetic 
linkers. DNA fragments were separated on agarose 
gels and then purified using GENECLEAN (BIO 101
Inc.).

Insertion of oligonucleotides was 
performed as follows. The two strands of 
non-phosphorylated oligonucleotides were annealed 
and ligated with plasmid DNA which has been 
digested with suitable restriction enzymes. One 
strand of the oligonucleotides which was not 
ligated (due to the 5'-OH structure) was removed
by heating and then separated by agarose gel 
electrophoresis. The purified fragments were 
phosphorylated and then ligated. 

Plasmid pCEV9 was constructed as 
follows. A retroviral vector pDOlT (Korman et 
al., 1987, Proc. Natl. Acad. Sci. USA 84, 
2150-2154) was cleaved by XbaI and recircularized
to remove the polyoma segment. The ClaI site was
converted to a NotI site by linker insertion, and
the EcoRI site was removed by repair ligation. A
synthetic MCS linker consisting of
oligonucleotides #1 and #2 (see Example 1, below)
was inserted between the SalI and BamHI sites,
and the BclI/BamHI fragment of SV40 DNA (Bethesda
Research Laboratories) containing its
polyadenylation signal was inserted at the XhoI
site. These manipulations produced pCEV9.

λgt11 DNA (Young and Davis, 1983, Proc.
Natl. Acad. Sci. USA 80, 1194-1198) was ligated
and then cleaved by EcoRI and XbaI. The ends
were repaired and NotI-linkered, and then pCEV9
DNA linearized by NotI was cloned into the λ DNA
to produce λpCEV9.

pCEV15 was constructed by modifying
pCEV8, which is a parental plasmid of pCEV9 and
lacks the MCS linker. A tac promoter fragment
(Pharmacia) and XhoI linker were inserted in the
SalI site. SfiI(B), HindIII, and BglII sites
were removed successively by restriction enzyme
digestion, polymerase treatment, and subsequent
ligation reaction. Removal of the SfiI site (on
the SV40 replication origin) did not impair the
SV40 early promoter activity. The MCS linker
was inserted between the BamHI and SalI sites,
and SfiI(B), HindIII, and BglII sites were
introduced again by insertion of the MCS-2 linker
(oligonucleotides #3 and #4, below) between the
EcoRI and ApaI sites. The resulting plasmid
pCEV15 was cloned in a new λ vector constructed
as follows. The segment spanning from the XhoI
site to the right cos end of λpCEV9 was replaced
by the corresponding fragment of λCharon28 (Rimm
et al., 1980, Gene 12, 301-309), to introduce the
cI deletion (KH54) and remove three HindIII
sites. The resulting phage λpCEV9c DNA was
ligated, cleaved by HindIII, and then repaired.
A NotI linker was ligated to the repaired HindIII
ends and the DNA was cleaved by NotI. The λ arms
were purified and ligated with NotI-digested
pCEV15 DNA, to produce λpCEV15.

Example 2. Efficiency of cloning and orientation
of model inserts.

Preparation of λ arms and the SfiI
adaptor for all cloning experiments was performed
as follows. λ DNA was ligated to seal cohesive
ends and then cleaved sequentially by SfiI and
EcoRI. After phenol/chloroform extraction,
λpCEV9 arms were purified by centrifugation
through a 5-20% potassium acetate gradient
(Maniatis et al., 1982). λpCEV15 arms were
purified by passage through a Sephadex G-50 spin
column (Boehringer Mannheim Biochem.).

The SfiI adaptor was prepared as
follows. About 1 nmol of oligonucleotides #5 and
#6 were separately phosphorylated by T4
polynucleotide kinase, mixed, heated at 80°C for
5 min and then slowly cooled to 4°C.

RNA for making cDNAs was extracted and
poly(A)+ RNA selected as described by Okayama
et al. (1987). cDNA was synthesized essentially
according to D'Alessio et al. (1987) with some
modifications. About 2.5 µg of poly(A)+ RNA in
10 µl of H2O was mixed with 0.5 µl of 100 mM
methylmercuric hydroxide (MeHg), and incubated at
room temperature for 5 min, followed by addition
of 0.5 µl of 1.4 M β-mercaptoethanol. After 5
min, 1.2 µl of RNasin (40 units/µl; Promega
Biotech.), 17.8 µl of H2O, 10 µl of 5x FS buffer
(250 mM Tris-HCl, pH 8.3/375 mM KCl/15 mM MgCl2/100
mM dithiothreitol), 2.5 µl of dNTP mixture (10
mM each of dGTP, dATP, dTTP and dCTP), 5 µl of
linker-primer (oligonucleotide #8; 1 µg/µl) and
2.5 µl of M-MLV reverse transcriptase (200
units/µl) were sequentially added, mixed, and
incubated at 37°C for 1 h. The tube was chilled
on ice and 290 µl of H2O, 7.5 µl of dNTP mix (10
mM each of dGTP, dATP, dCTP, and TTP), 40 µl of
10x SS buffer (188 mM Tris-HCl, pH 8.3/906 mM
KCl/46 mM MgCl2/38 mM DTT), 10 µl of DNA
polymerase I (1.25 units/µl) and 1.8 µl of RNase
H (0.25 units/µl) were added, mixed and
incubated at 16°C for 2 h. The reaction mixture
was heated at 70°C for 10 min, and 5 µl of T4
DNA polymerase (1 unit/µl) was added and
incubated at 37°C for 10 min. The reaction was
terminated by the addition of 40 µl of 0.25 M
EDTA, and the mixture was extracted with
phenol/chloroform twice followed by chloroform
twice. cDNA was ethanol-precipitated from 2.5 M
ammonium acetate, washed, and then dried. The
pellet was dissolved in 10 µl of H2O and then
4 µl of SfiI adaptor (0.8 µg/µl), 4 µl of 5x
ligation buffer (500 mM Tris-HCl, pH 7.6/100 mM
MgCl2/10 mM ATP/10 mM dithiothreitol/50% (w/v)
polyethylene glycol-8000) and 2 µl of T4 DNA
ligase (1 unit/µl) were mixed and incubated at
14°C overnight. A 10 µl aliquot was then mixed
with 1 µl of 10x SfiI buffer (100 mM Tris-HCl,
pH 7.9/500 mM NaCl/100 mM MgCl2/60 mM
β-mercaptoethanol/1 mg/ml bovine serum
albumin), 2 µl of SfiI (10 units/µl) and 7 µl of
H2O. Digestion was performed at 50°C for 1 h.
cDNA fragments were purified by low-melting point
agarose gel electrophoresis or by passage through a
spun column (Maniatis et al., Molecular Cloning,
Cold Spring Harbor, 1982) packed with Sepharose
CL-4B (Pharmacia). An aliquot of the cDNA
preparation was then mixed with λpCEV vector
arms and ethanol-precipitated. DNA was dissolved
in 8 µl of H2O, and then 1 µl of 10x
kinase-ligase buffer (660 mM Tris-HCl, pH 7.5/100
mM MgCl2/50 mM dithiothreitol, 500 mM ATP) and 1
µl of T4 DNA ligase (1 unit/µl) were added,
mixed, and incubated at 14°C overnight. In
vitro packaging was performed using GigaPack Gold
(StrataGene) as directed.
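
As a rough consistency check on the first- and second-strand recipes above, the component volumes can be summed to see how far each concentrated buffer is diluted. The sketch below is illustrative only: it takes the volumes exactly as listed, in whatever common unit they share, so only the ratios matter. The helper name `total_volume` is hypothetical and not part of the described method.

```python
# Volumes are taken as listed in the protocol; all share one unit,
# so only their ratios matter here.
first_strand = {
    "poly(A)+ RNA in H2O": 10.0,
    "methylmercuric hydroxide": 0.5,
    "beta-mercaptoethanol": 0.5,
    "RNasin": 1.2,
    "H2O": 17.8,
    "5x FS buffer": 10.0,
    "dNTP mixture": 2.5,
    "linker-primer": 5.0,
    "M-MLV reverse transcriptase": 2.5,
}

second_strand_additions = {
    "H2O": 290.0,
    "dNTP mix": 7.5,
    "10x SS buffer": 40.0,
    "DNA polymerase I": 10.0,
    "RNase H": 1.8,
}


def total_volume(components):
    """Sum component volumes (hypothetical helper for this sketch)."""
    return sum(components.values())


fs_total = total_volume(first_strand)
ss_total = fs_total + total_volume(second_strand_additions)

# Dilution factor of each concentrated buffer in its reaction.
fs_dilution = fs_total / first_strand["5x FS buffer"]
ss_dilution = ss_total / second_strand_additions["10x SS buffer"]

print(f"first-strand total:  {fs_total:.1f} (5x buffer diluted {fs_dilution:.2f}-fold)")
print(f"second-strand total: {ss_total:.1f} (10x buffer diluted {ss_dilution:.2f}-fold)")
```

On these figures the first-strand mixture comes to 50 volume units, so the 5x buffer ends at 1x, and the second-strand mixture comes to just under 400 units, so the 10x buffer also ends close to 1x.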

As shown in Fig. 4B, neither the
vector nor the H-ras insert SfiI fragment was
self-ligated, while self-ligation occurred when
the EcoRI/ApaI fragments were used instead. In
the ADC system, ligation occurred only when both
the vector and insert fragments were present in
the reaction mixture (Fig. 4B). To characterize
the directional cloning capacity of the system,
the H-ras insert and vector SfiI fragments were
ligated and used to transform E. coli strain
DH5α, and 20 kanamycin-resistant colonies were
analyzed. As shown in Fig. 4C, all plasmids
contained single inserts in the expected
orientation, indicating that the ADC method
provides both directional cloning and positive
selection for the presence of inserts. To
further examine the performance of the ADC method
using the λ system, model inserts prepared to
have SfiI(A-) and SfiI(B+) ends were ligated with
λpCEV9 arms (see Fig. 2C). As shown in Table I,
λpCEV9 arms alone did not produce active phages
efficiently even when the ligation reaction was
carried out, while presence of model inserts in
the ligation mixture increased the titer of active
phages by three orders of magnitude. These results
indicated that successful ligation and phage
propagation depended on the presence of the model
insert in the reaction mixture. All of these
findings indicated that the cloning procedure
results in low background and efficient directional
cloning.

Table I. Packaging efficiency of λpCEV9 DNA

DNA                                      Titer^a
                                         (pfu/µg λ arms)

λpCEV9 arms                              1 x 10*
λpCEV9 arms, ligated                     8 x 10*
λpCEV9 arms + Insert A^b, ligated        8 x 10*
λpCEV9 arms + Insert B^c, ligated        8 x 10'

Footnotes for Table I:

a The reaction mixture contained 66 mM Tris-HCl
(pH 7.5), 10 mM MgCl2, 5 mM dithiothreitol, 50 mM ATP, and 0.1
µg/µl of DNA. Incubation was performed at 14°C overnight, and
the phage were produced by in vitro packaging and titered on
LE392.

b A 2 kb DNA fragment having the SfiI(A-) and
SfiI(B+) ends (see Fig. 2).

c A DNA fragment similar to the insert A, except that
the SfiI(A-) end was created by ligation of the SfiI adaptor.

To measure the efficiency of cDNA cloning
using a natural template, 2.5 µg of a poly(A)+ RNA
preparation was denatured by heating and used to
synthesize cDNA from a linker-primer
(oligonucleotide #7). The cDNA was blunt-ended, and
the SfiI adaptor was ligated to both ends. The
molecules were cleaved partially by SfiI, and then
cloned in λpCEV9. A total of 2.5 x 10^8 plaque
forming units (pfu) was obtained, indicating that
the method was extremely efficient.

Since SfiI is an infrequent cutter, almost
all cDNA species in the libraries constructed by the
ADC method should remain intact. Nonetheless,
cDNAs containing SfiI sites might be excluded from
our cDNA libraries. To solve the problem, partial
SfiI digestion is usually performed. An alternative
strategy involves cDNA synthesis from a
linker-primer containing the recognition site of
another infrequent cutter, MluI, in addition to the
SfiI site. The cDNA preparation could then be
divided into two parts, one cleaved by SfiI and the
other by MluI. A short oligonucleotide ligated to
the cDNAs could be utilized to convert the MluI end
to an SfiI(B+) end.
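
Whether a given cDNA carries an internal site for either infrequent cutter can be screened for in silico. The sketch below scans a sequence for the SfiI (GGCCNNNNNGGCC) and MluI (ACGCGT) recognition sequences using regular expressions. The function name `find_sites` and the test sequence are invented for illustration and are not part of the procedure described here.

```python
import re

# Recognition sequences; N in the SfiI site is any base.
SITES = {
    "SfiI": "GGCC[ACGT]{5}GGCC",
    "MluI": "ACGCGT",
}


def find_sites(seq, enzyme):
    """Return 0-based start positions of every recognition site for `enzyme`."""
    pattern = SITES[enzyme]
    # A lookahead lets overlapping sites be reported as well.
    return [m.start() for m in re.finditer(f"(?={pattern})", seq.upper())]


# Hypothetical sequence with one SfiI site and one MluI site.
example = "TTTGGCCATTACGGCCAAAACGCGTTT"
print(find_sites(example, "SfiI"))
print(find_sites(example, "MluI"))

# Expected spacing of an 8-base specificity in random sequence with
# equal base frequencies (an idealisation, not a measured value).
print(4 ** 8)
```

Both recognition sequences are palindromic, so scanning one strand is sufficient. Under the split-digestion strategy outlined above, a cDNA with a hit for one enzyme but none for the other would be expected to stay intact in the half of the preparation cut with the other enzyme.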

Example 3. Isolation of a Mouse Hepatoma Oncogene
cDNA Using a Novel Phenotypic Expression Cloning
System.

The following protocols/methodologies are
referred to in sections 3.1-3.6 below.

cDNA Library Construction

cDNA libraries were constructed as
described above except for the use of a newly designed
adaptor and vector. The new SfiI adaptor does not
contain the ATG codon in the sense strand and
consisted of two oligonucleotides,
5'-CCAATCGCGACC-3' and 5'-GGTCGCGATTGGTAA-3'.
Amplification of the library and preparation of the
DNA were performed by the standard procedures
(Maniatis et al. 1982).
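
The pairing of the two adaptor oligonucleotides can be checked with a few lines of code. The sketch below is an illustration rather than part of the described method. It reads the ambiguous character in the second oligonucleotide as G, the only base that pairs with the first oligonucleotide at that position, and the variable and function names are chosen here for convenience.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


short_oligo = "CCAATCGCGACC"
long_oligo = "GGTCGCGATTGGTAA"

# The long oligonucleotide should begin with the reverse complement of
# the short one; whatever remains is the single-stranded 3' extension.
paired = reverse_complement(short_oligo)
assert long_oligo.startswith(paired)
overhang = long_oligo[len(paired):]

print("duplex region:", paired)
print("3' extension :", overhang)
print("ATG present  :", any("ATG" in o for o in (short_oligo, long_oligo)))
```

The annealed adaptor is blunt at one end and carries a three-base 3' extension (TAA) at the other. Neither strand contains an ATG triplet, which agrees with the stated design goal.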

Cell Culture and DNA transfection

All cells used were derivatives of NIH
3T3 (Jainchill, J.L., et al. (1969) J. Virol. 4,
549). Calcium phosphate transfection (Wigler, M.
et al., (1977) Cell 11, 223) was used to introduce
DNA into cells. Cells were maintained in Dulbecco's
modified Eagle's medium (DMEM) containing 5% calf
serum.

Plasmid Rescue

The genomic DNA (1.2 µg) isolated from
CT18-2B or CT18-2C was cleaved by XhoI, extracted
with phenol-chloroform and then chloroform. The DNA
was precipitated and resuspended in 355 µl of H2O.
Fifty µl of 10x kinase-ligase buffer (660 mM
Tris-HCl, pH 7.5/100 mM MgCl2/50 mM dithiothreitol,
500 mM ATP) and 5 units of T4 DNA ligase (BRL) were
added and incubated overnight at 15°C. The ligated
DNA was extracted and precipitated as above and then
resuspended in 10 µl of TE. Four aliquots (100 µl
each) of PLK-F' competent cells (Stratagene) were
transformed by 0.5, 1, 2, and 4 µl of the ligated
DNA as directed by the manufacturer. After the heat
shock, the cells were diluted 10 fold with S.O.C.
medium (BRL) containing 1 mM IPTG to induce
expression of the neo gene driven by the tac
promoter. The culture was incubated for 2 h with
shaking and plated on NZY hard agar containing
ampicillin (100 µg/ml), kanamycin (25 µg/ml), and
IPTG (100 µM/ml).

Recombinant DNA Techniques 

Preparation of λ and plasmid DNA was
performed as described above. Genomic DNA was
extracted by the standard procedure (Maniatis et
al., 1982). Total RNA was isolated and
poly(A)-selected as described (Okayama et al.
(1987) PNAS 84, 8573).

Southern and Northern Analysis 

DNA fragments were isolated by Geneclean 
(BIO101 Inc.) and labeled by random priming using 
Oligo Labeling kit (Pharmacia). Hybridizations were 
performed as described (Kraus et al., 1987). 

3.1 Development of an Efficient Stable Expression 
cDNA Cloning System 

The λpCEV27 system was developed to clone
cDNAs by means of stable phenotypic changes induced
by a specific cDNA. Use of a λ-plasmid composite
vector made it possible to generate high complexity
cDNA libraries and to efficiently excise the plasmid
from the stably integrated phagemid DNA. This
phagemid vector (Figure 6) contained several
features including two SfiI sites for construction
of cDNA libraries using the automatic directional
cloning (ADC) method, an M-MLV LTR promoter suitable
for cDNA expression in mammalian cells, the SV40
promoter-driven neo gene as a selectable marker, and
multiple excision sites (MESs) for plasmid rescue
from genomic DNA. The λpCEV27 system incorporated,
in addition to the M-MLV LTR, the rat preproinsulin
polyadenylation (polyA) signal downstream from the
cDNA cloning site (Fig. 6). In this vector, the
bacterial neo gene was placed under the independent
control of the SV40 early promoter and the SV40 late
polyA signal for use in marker selection in
mammalian cells. In contrast to λpCEV15, the bona
fide promoter of the neo gene was removed so as to
fuse the SV40 promoter directly to the neo
structural gene. Thus, in λpCEV15, expression of
the neo gene in E. coli was achieved by
transcription from the trp-lac fused promoter tac,
inserted upstream from the SV40 early promoter (Fig.
6). By use of the tac promoter, it was possible to
utilize IPTG-inducible selection for kanamycin
resistance. Finally, the f1 replication origin and
SP6 and T7 phage promoters were included to
facilitate analysis of cDNA inserts by production
of single-stranded DNA and RNA transcripts,
respectively (Fig. 6).

The strategy for expression cloning of
oncogene cDNAs is summarized in Figure 7. When
library cDNA is used to transfect mammalian cells,
cDNA clones are integrated with recombination
between λ DNA and host genomic DNA. For plasmid
rescue, genomic DNA extracted from a transformant is
subjected to digestion with an enzyme which can
cleave the λ-plasmid junctions. The resulting DNA
can then be circularized and used for bacterial
transformation. For this purpose, the sites for two
additional infrequent cutters, XhoI and MluI, were
included along with the NotI site. Because of the
second selectable marker in bacterial cells, the
ampicillin resistance gene (amp), it was possible to
select transformed E. coli cells resistant to both
ampicillin and kanamycin, avoiding selection of
truncated plasmid clones.

3.2 Characterization of Oncogenes Activated in 
Mouse Liver Tumors 

We have previously analyzed hepatocellular 
tumors of the mouse strain B6C3F1 for the presence 
of activated oncogenes (Reynolds et al. (1987) 
Science 237, 1309-1316). Although the majority were 
activated ras or c-raf oncogenes, four could not be
identified. The sources of these oncogenes were
tumors designated OT4, OT18, OT23, and OT28. One
(OT23) was spontaneously generated, while the others
were associated with chronic furfural exposure
(Reynolds, S.H., et al., (1987)). Genomic DNAs of
NIH/3T3 transformants containing each of the
unidentified oncogenes were examined under low
stringency hybridization conditions using a number
of known and potential oncogene probes including
abl, myb, ets, fos, fgr, fms, rel, src, sis, yes,
p53, ros, PDGF-A, met, dbl, IskT, myc, N-myc, rho,
mos, erbA, pin, lea, H-ras, N-ras, K-ras, c-raf, and 
erbB-2. None showed DNA fragments with either 
increased intensity or abnormal sizes relative to 
those detected in NIH/3T3 control DNA (data not 
shown). Thus, none of these transforming genes 
appeared to be closely related to any of the genes 
used as probes. 

3.3 Expression cDNA Cloning of an Oncogene of a 
Furfural-induced Hepatoma 

Using the λpCEV27 expression cloning
system, we attempted to clone transforming cDNA
from one furfural-induced tumor, OT18. A cDNA
library (3 x 10^6 independent clones) was constructed
from poly(A)+ RNA extracted from a secondary
transformant of the tumor. Transfection of 80
plates of NIH/3T3 cells with 5 µg/plate of the
expression cDNA library led to the detection of
seven foci, which demonstrated G-418 resistance.
These results indicated that each had taken up and
stably integrated vector DNA, making it likely that
the transformed foci were induced by exogenous cDNA
rather than arising as a result of spontaneous
transformation.

Two of these transformants, designated
CT18-2B and CT18-2C, were selected for plasmid
rescue. By restriction mapping of several distinct
plasmids obtained from each transformant, it was
possible to establish that one plasmid rescued from
each had the identical insert (data not shown).
These results suggested that this cDNA might encode
the oncogene product. Transfection analysis of each
rescued plasmid DNA demonstrated that these same two
cDNA clones possessed high-titered transforming
activity of around 10 J ffu/nmol DNA, while none of
the other plasmids rescued from the same
transformants showed detectable activity.

To determine whether other transformants
induced by the cDNA library contained the OT18
oncogene, genomic DNA extracted from each primary
transformant was digested with SalI to release cDNA
inserts from the vector and subjected to Southern
blotting analysis using the OT18 cDNA insert as a
probe. Since SalI is an infrequent cutter for
mammalian DNA, genomic DNA was cleaved to very large
fragments which remained near the origin of the gel.
Thus, the relatively small cDNA fragments released
from cellular DNAs by SalI cleavage could be
separated from the bulk of genomic fragments. As
shown in Figure 8, each of the cDNA library
transformants contained the OT18 oncogene cDNA
insert. The sizes ranged from around 2.3 kb to 7
kb, suggesting that several of the inserts
represented independent cDNA clones of the oncogene.

3.4 Structural Analysis of T18 Oncogene cDNA

A detailed restriction map of the 2.1 kb
insert of one of the transforming plasmids was
constructed, and the clone was subjected to sequence
analysis. A database search indicated that the
5' portion of the cDNA contained an unknown
sequence, while the 3' region was closely related to
human B-raf (Ikawa et al., (1988) Mol. Cell. Biol.
8, 2651-2654), and chicken R-mil (Marx et al., (1988)
EMBO 7, 3369-3373) genes (Figure 9a). Comparison of
the predicted amino acid sequences with that of
B-raf indicated identity (Figure 9) with the exception
of a single amino acid difference at position 324,
in which Gly was substituted for Ala in human B-raf.

There was also complete identity with 
avian R-mil except for a small stretch of nine amino 
acids at the R-mil C-terminus, where recombination 
with an avian retroviral env gene caused this 
substitution. 

To determine the breakpoint, we also
compared the T18 nucleotide sequence with that of
proto B-raf and v-R-mil. There was no homology with
either sequence upstream from position 1040 in the
T18 oncogene. Thus, position 1040 represents the
junction between an unknown sequence and the B-raf
gene. R-mil is a viral oncogene and encodes a
gag-R-mil-env fusion protein. The junction of gag and
R-mil has been mapped 144 nucleotides upstream from
the T18 break point (Marx et al., 1988), while the
junction of a different sequence and the human
B-raf oncogene was 174 nucleotides upstream from the
junction in the T18 oncogene. In each of the B-raf
oncogenes, including T18, the breakpoints did not
disrupt the predicted kinase domain of the protein.

3.5 Evidence for in Vivo Oncogene Activation

The human B-raf gene product is around 84%
related to the amino acid sequence of the c-raf
oncogene. Another member of the raf family, A-raf,
is also structurally similar. Most raf oncogenes
have been activated by mechanisms involving
structural rearrangements due to recombination and
loss of amino terminal sequences of the raf coding
sequence (Rapp, U.R., et al., (1988) In: Reddy, E.P.
(ed.) The Oncogene Handbook, Elsevier Science
Publishers B.V.). Moreover, most reported instances
have involved in vitro activation of these genes
during DNA transfection rather than by mechanisms
leading to oncogene activation within the tumor
itself (Ishikawa, F., et al., (1987) Mol. Cell.
Biol. 7, 1226); (Stanton, V.P. and Cooper, G.M.
(1987) Mol. Cell. Biol. 7, 1171); (Ikawa, S., et al.,
(1988) Mol. Cell. Biol. 8, 2651-2654). The
rearrangement activating the B-raf oncogene might
have occurred during the course of cDNA library
construction or as an artifact of DNA transfection
with the original tumor DNA. Alternatively, the
oncogene might have been activated within the
original tumor itself.

While original tumor DNA was not
available, it was possible to analyze two primary
transformants which had been independently induced
by this tumor DNA. As shown in Figure 10,
rearrangement as well as amplification of both
non-B-raf and related B-raf portions of the gene
were found not only in the secondary transfectant,
which was the source of the cDNA library, but in
both primary transformants, PT18-1 and PT18-2.
Since such in vitro rearrangements are very rare,
these findings strongly argue that the oncogene was
activated in vivo in the hepatoma as part of the
neoplastic process.

3.6 Detection of the mRNAs for the T18 Oncogene. 

In an effort to characterize the B-raf 
oncogene transcript and search for evidence of 
additional B-raf oncogenes among the 3 other 
hepatoma oncogenes, we subjected polyA selected RNAs 
from primary or secondary NIH/3T3 transfectants 
containing each oncogene to analysis with DNA probes
from 5' (B-raf unrelated) and 3' (B-raf) portions of
the T18 oncogene. As shown in Fig. 11, control
NIH/3T3 cells contained a 4.2 kb RNA that hybridized
with the 5' non-B-raf related portion of the
oncogene but there was no detectable B-raf
transcript. In contrast, the second cycle T18
transfectant, which was the source of expression 
cDNA library, showed a major 4.2-kb as well as minor 
10- and 3-kb transcripts which appeared to hybridize 
with both probes. 

Fig. 11 further shows that a primary T23
oncogene transfectant contained multiple B-raf
hybridizing transcripts, indicating that it also
contained another B-raf oncogene. Of note, the
several transcripts detected differed in their sizes
from those of the T18 oncogene. Moreover, none of
these transcripts was detected by the B-raf probe of
the T18 oncogene (Fig. 11). Thus, if the T23
oncogene arose by a mechanism involving B-raf gene
rearrangement, this rearrangement was different from
that associated with activation of the T18 oncogene.
The transfectant induced by the T28 oncogene (Fig.
11) did not show abnormal B-raf hybridizing RNAs,
arguing that this oncogene must be distinct from
B-raf.

Example 4. cDNA Expression Cloning of the
Keratinocyte Growth Factor Receptor by
Complementation for Autocrine Transformation

4.1 Identification of epithelial cell cDNAs capable
of transforming NIH/3T3 cells.

We prepared a cDNA library from BALB/MK
epidermal keratinocytes (Weissman, B.E. & Aaronson,
S.A. Cell 32, 599-606 (1983)) using the automatic
directional cloning (Miki, T. et al., Gene 83:137-146,
(1989)) in an improved expression vector
λpCEV27 (Miki, unpublished data). A library of
4.5 x 10* independent clones was amplified, phage
particles purified, and their DNA extracted. DNA
transfection of NIH/3T3 mouse embryo fibroblasts
(Jainchill, J.L. et al., J. Virol 4, 549-553
(1969)), which synthesize KGF, was performed by the
calcium phosphate technique (Wigler, M. et al. Cell
11, 223-232 (1977)). Individual plates were
examined at 10-18 days for the appearance of
transformed foci. We detected 15 foci among a total
of 100 individual cultures transfected with 5 µg
library cDNA/plate. Each focus was tested and shown
to be resistant to G418, indicating that it
contained integrated vector sequences. Three
representative transformants were chosen for more
detailed characterization based upon differences in
their morphologies (Fig. 13).

When we performed plasmid rescue, each
transformant gave rise to at least 3 distinct cDNA
clones as determined by physical mapping. To
examine their biological activities, each clone was
subjected to transfection analysis on NIH/3T3 cells.
A single clone rescued from each transformant was
found to possess high-titered transforming activity
ranging from lO'-io 4 focus forming units/nmole DNA.
Moreover, the morphology of foci induced by each
cDNA was similar to that of the parental
transformant. Because of their distinct physical
maps and distinguishable biological properties, we
tentatively designated the genes for these
transforming cDNAs as epithelial cell transforming
(ect) 1, 2 and 3. Transfectants induced by the
individual transforming plasmids were utilized in
subsequent analyses.

4.2 Specific KGF binding by ect1-transformed cells.

Suramin is an agent known to interfere
with ligand-receptor interactions including the
binding of PDGF (Fleming, T.P. et al., Proc. Natn.
Acad. Sci. U.S.A. 86, 8063-8067 (1989); Betsholtz,
C. et al. Proc. Natn. Acad. Sci. U.S.A. 83,
6440-6444 (1986)) and KGF with their respective
receptors. When an ect1 transfectant was exposed to
suramin, its proliferation was markedly inhibited,
associated with reversion of the transformed
phenotype (data not shown). To further investigate
the possibility that ect1 might encode the KGF
receptor, we performed binding studies with
recombinant [125I]-KGF as the tracer molecule. As
shown in Fig. 14, BALB/MK cells demonstrated
specific high affinity binding of [125I]-KGF while
there was no such binding detectable to NIH/3T3
cells. Of note, expression of the ect1 gene by
NIH/3T3 cells resulted in the acquisition of
3.5-fold more [125I]-KGF binding sites than BALB/MK
cells (Fig. 13). Under these same conditions,
neither NIH/ect2 nor ect3 bound significant levels
of the labelled growth factor. These results
strongly suggested that ect1 encoded the KGF
receptor, whose introduction into NIH/3T3 cells had
completed an autocrine transforming loop.

4.3 Molecular characterization of ect1.

To characterize ect1, the transforming
4.2 kb cDNA released by SalI digestion was used as a
molecular probe to hybridize SalI restricted genomic
DNAs of NIH/3T3 as well as NIH/3T3 transfectants
containing ect1, ect2 or ect3. While the expected
4.2 kb DNA fragment was detected in the ect1
transformant (Fig. 15a), neither NIH/3T3 nor the
other transfectants showed evidence of a SalI
fragment hybridized by the cDNA insert. These
results further argued that ect2 and ect3
represented independent transforming genes. When
EcoRI was used to cleave normal mouse DNA, we
observed several distinct ect1 hybridizing DNA
fragments, which reflected endogenous ect1 sequences
or closely related genes (Fig. 15b). Ect1 related
sequences were also demonstrated in the DNAs of
other species analyzed, including human, indicating
its high degree of conservation in vertebrate
evolution.

When we analyzed expression of transcripts
related to ect1 in BALB/MK and NIH/3T3 cells, we
observed a single transcript of around 4.2 kb in
BALB/MK cells (Fig. 15c). Thus, our cDNA clone
represented essentially the complete ect1
transcript. In NIH/3T3 cells, a transcript of
comparable size was only faintly detectable under
relatively stringent hybridization conditions. We
estimated that its expression was several fold lower
than the level of the 4.2 kb transcript in BALB/MK
cells. Thus, if this transcript were to represent
ect1 rather than a related gene, its expression was
markedly lower in fibroblasts as compared to
epithelial cells.

4.4 Ect1 encodes a transmembrane tyrosine kinase of
the FGF receptor family.

We next determined the nucleotide sequence
of the ect1 4.2 kb cDNA insert. Analysis of the
predicted amino acid sequence revealed a long open
reading frame of 2235 nucleotides (nucleotide
position 562-2796). Two methionine codons were
found at nucleotide positions 619 and 676,
respectively (Fig. 16a). The second methionine
codon perfectly matched Kozak's consensus for a
translational initiator sequence (A/GCCATGG) (Kozak,
M. Nucleic Acids Res. 15, 8125-8148 (1987)).
Moreover, it was followed by a characteristic signal
sequence of 21 residues of which 10 were identical
to those of the putative signal peptide of the mouse
bFGF receptor (Reid, H.H., et al., Proc. Natn. Acad.
Sci. U.S.A. 87, 1596-1600 (1990); Pasquale, E.B. &
Singer, S.J. Proc. Natn. Acad. Sci. U.S.A. 86,
8722-8726 (1989); Safran, A. et al. Oncogene 5, 635-643
(1990)). Thus, it seems likely that the second ATG
is the authentic initiation codon for the KGF
receptor (KGFR). If so, the receptor polypeptide
would comprise 707 amino acids with a predicted size
of 82.5 kd.
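
The stated polypeptide length can be cross-checked against the quoted nucleotide positions. The short sketch below does the arithmetic under one assumption that the text does not state explicitly, namely that position 2796 is the last base of the final sense codon, so the stop codon lies outside the quoted range. The function name `codons_from` is introduced here only for illustration.

```python
ORF_START, ORF_END = 562, 2796   # nucleotide positions of the open reading frame
MET_CODONS = (619, 676)          # positions of the two methionine codons


def codons_from(start, end=ORF_END):
    """Number of codons from `start` through `end`, inclusive."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("start is not in frame with the end position")
    return length // 3


print("ORF length (nt):", ORF_END - ORF_START + 1)
for pos in MET_CODONS:
    print(f"ATG at {pos}: {codons_from(pos)} codons")
```

Under that assumption the reading frame is 2235 nucleotides long, and initiation at position 676 gives 707 codons, in agreement with the 707 amino acids stated above. Initiation at position 619 would instead give 726 codons.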

The amino acid sequence of the KGFR
predicted a transmembrane tyrosine kinase most
closely related to the mouse bFGF receptor (bFGFR).
The percent similarity between both proteins is
shown in Fig. 16b. The putative KGFR extracellular
portion contained two immunoglobulin (IgG)-like
domains, exhibiting 77% and 60% similarity with the
IgG-like domains 2 and 3, respectively, of the mouse
bFGFR. Recent studies have revealed a variant form
of the bFGFR, whose extracellular domain also
contains only these two corresponding IgG-like
domains (Reid, H.H., et al., Proc. Natn. Acad. Sci.
U.S.A. 87, 1596-1600 (1990)). The sequence
N-terminal to the first IgG-like domain of the KGFR
was 63 residues long in comparison to 88 residues
found in the shorter form of the mouse bFGFR. Both
the chicken and mouse bFGFRs contain a series of
eight consecutive acidic residues between the first
and second IgG-like domains (Reid, H.H., et al.,
Proc. Natn. Acad. Sci. U.S.A. 87, 1596-1600 (1990);
Pasquale, E.B. & Singer, S.J. Proc. Natn. Acad. Sci.
U.S.A. 86, 8722-8726 (1989); Safran, A. et al.
Oncogene 5, 635-643 (1990); Lee, P.L. et al.
Science 245, 57-60 (1989)). This sequence is even
retained in the shorter form of the mouse bFGFR,
which lacks the first IgG-like domain (Fig. 16b).
However, the KGFR did not contain such an acidic 
domain. Whether this reflects significant 
functional differences between these receptors 
remains to be determined. 

The intracellular portion of the KGFR was
highly homologous to the bFGFR tyrosine kinase (Fig.
16). The central core of the catalytic domain was
flanked by a relatively long juxtamembrane sequence,
and the tyrosine kinase domain was split by a short
insert of 14 residues, similar to that observed in
avian, human and murine bFGF receptors (Reid, H.H.,
et al., Proc. Natn. Acad. Sci. U.S.A. 87, 1596-1600
(1990); Pasquale, E.B. & Singer, S.J. Proc. Natn.
Acad. Sci. U.S.A. 86, 8722-8726 (1989); Safran, A.,
et al. Oncogene 5, 635-643 (1990); Lee, P.L. et al.
Science 245, 57-60 (1989); Ruta, M. et al. Oncogene
3, 9-153 (1988); and Ruta, M. et al. Proc. Natn.
Acad. Sci. U.S.A. 86, 5449-5434 (1989)). Hanafusa
and co-workers isolated a partial cDNA encoding a
tyrosine kinase, designated bek, by bacterial
expression cloning using phosphotyrosine antibodies
(Kornbluth, S., et al., Molec. Cell Biol. 8,
5541-5544 (1988)). The reported sequence of bek was
identical to the KGFR in the tyrosine kinase domain
(Fig. 16b). Although only partial sequence of bek
is available, it is very likely to encode the mouse
KGF receptor.



4.5 Functional analysis of the cloned KGF receptor. 

Because of the existence of more than one
receptor of the FGF family (Reid, H.H., et al. Proc.
Natn. Acad. Sci. U.S.A. 87, 1596-1600 (1990)), we
sought to characterize in detail the binding
properties of the KGF receptor isolated by
expression cloning. Scatchard analysis of [125I]-KGF
binding to the NIH/ect1 transfectant revealed
expression of two similar high affinity receptor
populations. Out of a total of ~3.8 x 10*
sites/cell, 40% displayed a Kd of 180 pM, while the
remaining 60% showed a Kd of 480 pM (data not
shown). These values are comparable to the high
affinity KGF receptors displayed by BALB/MK cells.
The pattern of KGF and FGF competition for
[125I]-KGF binding to NIH/ect1 cells was also very
similar to that observed with BALB/MK cells (Fig.
17). Although maximum [125I]-KGF binding to NIH/ect1
cells was 3.5 fold higher than to BALB/MK, there was
50% displacement by 2 ng/ml of either KGF or aFGF
with each cell type. Similarly, both cells showed
15-fold less efficient competition by bFGF for bound
[125I]-KGF. Together with observations that parental
NIH/3T3 cells lack detectable specific [125I]-KGF
binding (Fig. 14), these results demonstrate that
the cloned KGF receptor expressed in NIH/3T3 cells
conferred the characteristic pattern of KGF and FGF
competition displayed by BALB/MK cells.
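
The two receptor populations reported from the Scatchard analysis can be pictured with a simple equilibrium binding model. The sketch below assumes two independent, non-interacting classes of sites, which is a standard textbook two-site model. The text does not say which model was actually fitted, so this is an assumption, and the function name `fractional_occupancy` is hypothetical. Occupancy is expressed as a fraction of total sites, so the absolute number of sites per cell is not needed.

```python
# Receptor populations from the Scatchard analysis: (fraction of sites, Kd in pM).
POPULATIONS = ((0.40, 180.0), (0.60, 480.0))


def fractional_occupancy(ligand_pm, populations=POPULATIONS):
    """Fraction of all sites occupied at a free ligand concentration (pM).

    Assumes independent, non-interacting sites at equilibrium.
    """
    return sum(f * ligand_pm / (kd + ligand_pm) for f, kd in populations)


for conc in (50.0, 180.0, 480.0, 2000.0):
    print(f"{conc:7.0f} pM free KGF -> {fractional_occupancy(conc):.3f} of sites occupied")
```

Because the two dissociation constants differ by less than threefold, the combined curve in this model stays close to that of a single class of sites, in line with the description of the two populations as similar.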

When [125I]-KGF is crosslinked to its
receptors on BALB/MK cells, two protein species of
165 and 137 kd have been observed. Taking into
account the size of KGF itself, we have estimated
the corresponding receptor species to be around 140
and 115 kd, respectively. When [125I]-KGF crosslinking
was performed with NIH/ect1 cells, we observed a
single species corresponding in size to the smaller,
137 kd species in BALB/MK cells (Fig. 17a).
Moreover, detection of this band was specifically
and efficiently blocked by unlabelled KGF. When
glycosylation is considered, the size of the KGF
receptor predicted by sequence analysis corresponds
reasonably well with the corrected size (115 kd) of
the crosslinked KGF receptor in the ect1
transfectant.
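
The size correction described above is a simple subtraction, which the following sketch makes explicit. It uses only the apparent and corrected sizes stated in the text; the dictionary labels and function name are illustrative assumptions, and the text itself describes the corrected values as approximate.

```python
# Apparent sizes (kd) of the crosslinked [125I]-KGF-receptor complexes and
# the corrected receptor sizes, as stated in the text for BALB/MK cells.
CROSSLINKED_KD = {"larger species": 165, "smaller species": 137}
CORRECTED_KD = {"larger species": 140, "smaller species": 115}


def implied_ligand_size(crosslinked, corrected):
    """Return the contribution (kd) attributed to bound ligand for each
    species, i.e. crosslinked complex size minus corrected receptor size."""
    return {name: crosslinked[name] - corrected[name] for name in crosslinked}


if __name__ == "__main__":
    sizes = implied_ligand_size(CROSSLINKED_KD, CORRECTED_KD)
    for name, size in sizes.items():
        print(f"{name}: about {size} kd attributed to bound KGF")
```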

As a final test of the functional nature
of the KGF receptor expressed in NIH/ect1 cells, we
investigated its capacity to induce tyrosine
phosphorylation of cellular proteins. Thus, intact
NIH/ect1 cells were exposed to KGF for 10 min, and
cell lysates were subjected to immunoprecipitation
and immunoblotting analysis utilizing
anti-phosphotyrosine (anti-Ptyr) antibody. As shown
in Fig. 17b, several putative substrates were
tyrosine phosphorylated in response to KGF addition.
These included pp55, pp65, pp90, pp115, pp150 and
pp190. Previous studies have indicated that proteins
of similar size are phosphorylated in response to KGF
triggering of BALB/MK cells. Moreover, the 115-kd
phosphoprotein matches the corrected size of the KGF
receptor crosslinked by [125I]-KGF. Thus, it may
reflect the autophosphorylated KGF receptor itself.

*********** 

For purposes of completing the
background description and present disclosure, each
of the published articles, patents and patent
applications heretofore identified in this
specification is hereby incorporated by reference
into the specification.

The foregoing invention has been 
described in some detail for purposes of clarity and 
understanding. It will also be obvious that various 
combinations in form and detail can be made without 
departing from the scope of the invention. 

One skilled in the art will
appreciate that the capacity for efficient rescue of
cDNA clones from mammalian cells is an important
function of a stable mammalian expression cloning
system. When plasmid cDNA libraries are used to
transfect mammalian cells, single plasmids
integrated in genomic DNA are difficult to release.
Plasmid rescue is readily achieved only when
multiple copies are clustered at a single
integration site (Noda et al., PNAS, 86, 162-166,
1989). Excision of the plasmid by induction of
replication from the SV40 origin using COS cell
fusion often results in rearrangement or truncation.
To rescue cDNA clones efficiently from stable
integration sites within mammalian host cells,
Applicants used a strategy involving λ-plasmid
composite vectors. The vectors contain plasmid
excision sites for multiple cutters including NotI,
MluI and XhoI. This allows the intact plasmid
containing insert to be efficiently rescued with low
probability of internal cleavage of the insert.
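
The low probability of internal cleavage can be estimated roughly. The sketch below is a back-of-the-envelope model, not a calculation taken from this specification: it assumes a random insert with equal base frequencies and independent positions, and uses a hypothetical insert length of 3000 bp. The recognition sequences are the standard ones for NotI, MluI and XhoI.

```python
# Recognition sequences of the excision enzymes named in the text.
RECOGNITION_SITES = {
    "NotI": "GCGGCCGC",  # 8 bp
    "MluI": "ACGCGT",    # 6 bp
    "XhoI": "CTCGAG",    # 6 bp
}


def prob_uncut(insert_length_bp, site):
    """Approximate probability that a random insert contains no copy of
    `site`, assuming equal base frequencies and independent positions.
    All three sites are palindromic, so one strand is sufficient."""
    k = len(site)
    windows = max(insert_length_bp - k + 1, 0)
    return (1.0 - 0.25 ** k) ** windows


if __name__ == "__main__":
    insert_length = 3000  # hypothetical cDNA insert length in bp
    for enzyme, site in RECOGNITION_SITES.items():
        print(f"{enzyme} ({len(site)} bp site): "
              f"P(insert uncut) ~ {prob_uncut(insert_length, site):.3f}")
```

Under this simplified model an 8 bp site is expected to leave a large majority of such inserts intact. All three sites contain CpG dinucleotides, which are under-represented in mammalian DNA, so the uniform model probably overstates how often they occur within a cDNA insert.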
WHAT IS CLAIMED IS:

1. A genetic cloning vector comprising
at least one replicon; and
a site for inserting DNA segments to
be cloned that includes at least two non-symmetrical
restriction enzyme recognition sequences, wherein

at least two of said non-symmetrical
restriction enzyme recognition sequences are
identical; and

the first of said identical
restriction enzyme recognition sequences is in the
inverted orientation with respect to a second
identical sequence, and

said first and second identical
restriction enzyme recognition sequences include
greater than six positions having invariable DNA
base pairs.


2. The vector according to claim 1 
wherein said identical restriction enzyme 
recognition sequences can be cleaved by the 
restriction enzyme SfiI.

3. A genetic cloning vector comprising
at least one replicon; and
a site for inserting DNA segments to
be cloned that includes at least two non-symmetrical
restriction enzyme recognition sequences, wherein

at least two of said non-symmetrical 
restriction enzyme recognition sequences are 
nonidentical. 

4. The vector according to claim 3
wherein at least two of said non-symmetrical
restriction enzyme recognition sequences that are
nonidentical can be cleaved by a single restriction
enzyme.

5. The vector according to claim 4 
wherein said single restriction enzyme is the 
restriction enzyme BstXI. 

6. The vector according to claim 3 
wherein at least one of said non-symmetrical 
restriction enzyme recognition sequences includes 
greater than six positions having invariable DNA 
base pairs. 

7. The vector according to claim 6 
wherein at least one of said non-symmetrical 
restriction enzyme recognition sequences including 
greater than six positions having invariable DNA 
base pairs can be cleaved by the restriction enzyme 
SfiI.

8. The vector according to claim 3 
wherein said replicon comprises a form of 
bacteriophage λ.

9. The vector according to claim 3 
further comprising regulatory elements that are 
located in relation to said site for insertion of 
DNA segments such that, when a DNA segment is 
inserted into said site, at least a portion of the 
sequence of said DNA segment is transcribed. 

10. The vector according to claim 9 
wherein said regulatory elements consist of 
promoters that entirely originate from 
bacteriophage. 

11. The vector according to claim 10
wherein said vector is either LambdaGEM™11 or
LambdaGEM™12.

12. The vector according to claim 9
wherein said regulatory elements are at least partly
of eukaryotic origin.

13. The vector according to claim 12
wherein said vector is either λpCEV15 or λpCEV9.

14. The vector according to claim 3 
further comprising a selectable marker that is 
functional in eukaryotic cells in which the vector 
can be replicated. 

15. A method for cloning a cDNA copy of a
eukaryotic mRNA, comprising the steps of:

(i) annealing a linker-primer DNA
segment comprising a single-stranded oligonucleotide
which comprises oligo(dT) at the 3' end, and a
single-stranded extension at the 5' end that is 
included in a first non-symmetrical restriction 
enzyme recognition sequence; 

(ii) enzymatically synthesizing the 
first strand of said cDNA from the linker-primer 
that is annealed with said mRNA molecule; 

(iii) enzymatically synthesizing the 
second strand of said cDNA using said first strand 
as the template under conditions such that 
single-stranded extensions on the synthesized cDNA 
molecule are made double-stranded; 

(iv) ligating onto the blunt-ended 
cDNA resulting from synthesizing said second strand, 
an adaptor DNA segment comprising a second non- 
symmetrical restriction enzyme recognition sequence 
that is nonidentical to said first non-symmetrical 
restriction enzyme recognition sequence;

(v) exposing said cDNA resulting
from ligation with said adaptor to one or more 
restriction enzymes that can cleave said first and 
second non-symmetrical restriction enzyme 
recognition sequences under conditions such that 
both of said sequences are cleaved, resulting in 
said vector DNA having two single-stranded ends that 
are not complementary; 

(vi) ligating said cDNA resulting 
from cleavage with said enzymes to DNA of a genetic 
cloning vector, said vector comprising 

at least one replicon; and 
a site for inserting DNA segments to 
be cloned that includes at least two non-symmetrical 
restriction enzyme recognition sequences, wherein 

in said DNA of said vector, at least 
two of said non-symmetrical restriction enzyme 
recognition sequences have been cleaved by one or 
more enzymes that can cleave said recognition 
sequences, resulting in said vector DNA having two 
single-stranded ends that are not complementary; 
wherein further, 

one of said single-stranded ends on 
said vector DNA that has been cleaved has a sequence 
that is complementary to the single-stranded 
extension on said linker-primer attached to said 
cDNA; and 

the other of said single-stranded
ends on said vector DNA that has been cleaved has a
sequence that is complementary to the single-
stranded extension on said adaptor attached to said
cDNA; and

(vii) transforming a suitable host
cell with the recombinant DNA segment comprising
said cDNA and said vector DNA that results from said
ligation of cDNA to vector DNA; and

(viii) identifying a clone of host
cells, resulting from transformation with said
recombinant DNA, that contains a recombinant DNA
segment including said cDNA.

16. A DNA segment having the sequence of
a genetic cloning vector comprising
at least one replicon; and
a site for inserting DNA segments to
be cloned that includes at least two nonidentical
restriction enzyme recognition sequences that are
non-symmetrical,

both of said nonidentical restriction
enzyme recognition sequences having been cleaved by
an enzyme or enzymes that can cleave them.

17. A reagent kit comprising a vector DNA
segment according to claim 16 and further including:

a linker-primer having a single-
stranded end that is complementary to one single-
stranded end of said vector DNA; and

an adaptor which, after cleavage by a
suitable restriction enzyme, has a single-stranded
end that is complementary to the other single-
stranded end of said vector DNA.

18. A genetic cloning vector comprising

at least one replicon;

a site for inserting a DNA segment to
be cloned that includes at least two non-symmetrical
restriction enzyme recognition sequences that can be
cleaved by a single restriction enzyme; and

at least two regulatory elements
located in relation to said site for insertion such
that, when a DNA segment is inserted into said site,
transcription of said DNA segment can be effected in
both the sense and antisense directions.

19. The cloning vector according to claim
18 wherein said cloning vector is a phagemid and
wherein said at least one replicon comprises at
least one plasmid replicon and at least one phage
replicon.



20. The cloning vector according to claim 
19 further comprising a single stranded phage origin 
of replication. 

21. The cloning vector according to claim 
18 wherein said vector is pCEV27. 

22. A DNA segment having a sequence that
encodes the amino acid sequence shown in Figure 9b.

23. The DNA segment according to claim 22 
wherein said segment has the sequence shown in 
Figure 9b. 

24. A DNA segment encoding a keratinocyte 
growth factor receptor. 

25. The DNA segment according to claim 24
wherein said receptor has the amino acid sequence
given in Figure 15a, or allelic or species
variations thereof.



26. A DNA construct comprising a vector 
and a DNA segment encoding a keratinocyte growth 
factor receptor. 



27. A DNA construct comprising a vector
and a DNA segment encoding a keratinocyte growth
factor receptor wherein the vector is the cloning
vector according to claim 18.

28. The construct according to claim 26
wherein the vector is pCEV27.

29. A host cell comprising the construct
according to claim 26.

30. A keratinocyte growth factor receptor
substantially free of proteins with which it is
naturally associated.

31. The receptor according to claim 30 
wherein said receptor has a higher affinity for 
keratinocyte growth factor and acidic fibroblast 
growth factor than basic fibroblast growth factor. 

32. The receptor according to claim 31 
wherein said receptor has the sequence shown in 
Figure 15a. 

33. A process of producing a keratinocyte
growth factor receptor comprising culturing the
cells according to claim 29 under conditions such
that said DNA segment is expressed and said factor
thereby produced.

34. A method of identifying the presence 
in a DNA sequence of a gene the protein product of 
which confers a phenotypically identifiable trait 
comprising: 

i) preparing a DNA expression
library containing said DNA sequence in said cloning
vector according to claim 9 or 18, in a manner such
that said gene is represented in said library;

ii) introducing said library into a
population of cells under conditions such that
integration into the genome of said cells and
expression of said gene are effected, so that said
phenotypically identifiable trait is caused to be
displayed;

iii) isolating DNA from said cells of 
said population displaying said phenotypically 
identifiable trait; and 

iv) excising a DNA segment 
containing said gene from said integrated DNA. 

35. The method according to claim 34 
wherein said DNA sequence is a cDNA sequence. 

36. The method according to claim 34 
wherein said gene encodes a ligand, a receptor for
which is normally produced by cells of said 
population, said cells, prior to introduction of 
said gene, being incapable of producing said ligand. 

37. The method according to claim 34 
wherein said gene encodes a receptor, the ligand 
which binds thereto being normally produced by cells 
of said population, said cells, prior to 
introduction of said gene, being incapable of 
producing said receptor. 

38. The method according to claim 34 
wherein said phenotypically identifiable trait is 
uncontrolled proliferation. 

39. The method according to claim 37 
wherein said receptor is the keratinocyte growth 
factor receptor. 

40. The method according to claim 34 
wherein said phenotypically identifiable trait is 
drug resistance. 
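
The requirement in claims 1, 2 and 15 that the cleaved ends be non-symmetrical and not complementary to one another can be illustrated with a short sketch. It relies on the known SfiI cleavage pattern (GGCCNNNN^NGGCC, leaving a 3-nucleotide 3' extension). The two site sequences below are hypothetical examples and are not the sequences present in the vectors of this specification. Placing SfiI-type ends on both sides of the cDNA is also a simplifying assumption, since claim 15 permits other enzymes.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")
SFII_PATTERN = re.compile(r"GGCC[ACGT]{5}GGCC")  # SfiI: GGCCNNNN^NGGCC

# Hypothetical SfiI sites differing in their variable (N) positions.
SITE_A = "GGCCATTATGGCC"
SITE_B = "GGCCGCCTCGGCC"

# Fixed (non-N) positions in the SfiI recognition sequence.
INVARIABLE_POSITIONS = sum(base != "N" for base in "GGCCNNNNNGGCC")


def revcomp(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def sfii_overhang(site):
    """Return the 3-nt 3' extension (written 5'->3') left on the upstream
    fragment when SfiI cleaves `site`. The downstream fragment carries the
    reverse complement of this sequence on the opposite strand."""
    if not SFII_PATTERN.fullmatch(site):
        raise ValueError(f"not an SfiI site: {site}")
    return site[5:8]


def can_anneal(overhang_a, overhang_b):
    """Two 3' extensions (each written 5'->3') can pair only if one is the
    reverse complement of the other."""
    return overhang_a == revcomp(overhang_b)


def ligation_outcomes(site_a, site_b):
    """Model a vector cut at site A (left arm) and site B (right arm) and a
    cDNA prepared with the matching ends; report which joins can form."""
    o_a, o_b = sfii_overhang(site_a), sfii_overhang(site_b)
    vector_left, vector_right = o_a, revcomp(o_b)
    insert_left, insert_right = revcomp(o_a), o_b
    return {
        "insert, intended orientation":
            can_anneal(vector_left, insert_left)
            and can_anneal(insert_right, vector_right),
        "insert, reversed orientation":
            can_anneal(vector_left, insert_right)
            and can_anneal(insert_left, vector_right),
        "vector arms rejoin without insert":
            can_anneal(vector_left, vector_right),
        "vector left arm pairs with itself":
            can_anneal(vector_left, vector_left),
    }


if __name__ == "__main__":
    print(f"invariable positions in an SfiI site: {INVARIABLE_POSITIONS}")
    for outcome, possible in ligation_outcomes(SITE_A, SITE_B).items():
        print(f"{outcome}: {'possible' if possible else 'not possible'}")
```

In this model only the intended orientation can form. A 3-nucleotide extension can never be its own reverse complement, because its middle base would have to pair with itself, which is one way of seeing why ends of this kind are non-symmetrical.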
Fig. 3
Fig. 4